The gory details – experiment design guide for beginners

In the previous post I outlined a strategy that allows beginner scientists to understand the complexity of a research project – something that they often struggle with. From the “tree of life” view of the research project there is a rather straightforward path to understanding the significance and interpreting the results of each experiment that they perform. I recommend that each beginner scientist answer (ideally in writing) the following questions before they even touch a pipette:

  1. What is the goal of this experiment?
  2. What is the hypothesis?
  3. What is the approach?
  4. What are the experimental groups and controls?
  5. What are the expected results?
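
If you keep an electronic lab notebook, one way to hold yourself to answering in writing is a small record with one field per question. The sketch below is only an illustration: the class name `ExperimentPlan`, its field names, and the example answers are all made up for this post, not part of any existing tool.

```python
from dataclasses import dataclass, fields


@dataclass
class ExperimentPlan:
    """One free-text answer per question (hypothetical field names)."""
    goal: str = ""
    hypothesis: str = ""
    approach: str = ""
    groups_and_controls: str = ""
    expected_results: str = ""

    def unanswered(self):
        """Return the names of the questions that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Made-up example: only the first two questions have been answered so far.
plan = ExperimentPlan(
    goal="Decide whether gene X is worth a full knockdown study",
    hypothesis="Knocking down gene X reduces cell proliferation",
)
missing = plan.unanswered()
if missing:
    print("Put the pipette away. Still unanswered:", ", ".join(missing))
```

For experiments that legitimately have no hypothesis, you would write that down explicitly in the `hypothesis` field rather than leave it blank.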

  1. What is the goal of the experiment?

This question is equivalent to “what is the next step in my project that this experiment will enable?” Some experiments will be “final”, so they will simply enable you to compose a figure and make a statement about the phenomenon under study. They are the experiments at the branches of the “tree of life” immediately attached to the root or perhaps one step removed from the root. Most experiments, however, are somewhere in the more “leafy” part of the tree of life, so they will usually enable us to take the next step towards the root. We must know this next step. This is absolutely crucial to experiment design. If you don’t know what the next step is, put the pipette away and go back to the basics. Maybe you are doing an experiment that is a dead end – don’t waste your time.

  2. What is the hypothesis?

A hypothesis is not the same as the goal. Sometimes the goal will simply be to test the hypothesis, with no further “next step”, but more often the two will be related yet distinct. The hypothesis will always be a statement that can be true or false. Again, if you don’t know what the hypothesis is, don’t touch your pipette. Here, I would like to add a caveat: there are two types of experiments that do not have a clear-cut hypothesis. One is a “fishing expedition”: here, instead of testing a hypothesis, you produce a list. Examples would be co-IP/MS experiments where you get a list of interacting proteins, RNA-seq experiments where you get a list of differentially expressed genes, etc. The second type of experiment where you don’t have a hypothesis is a “tool generation” or “method optimization” experiment. For instance, you make a specific DNA construct or you test many different PCR reaction conditions. These experiments have a very clear goal, but no clear hypothesis.

An experiment can have multiple hypotheses. For instance, you may be testing a few candidate genes for their functional relevance; the function of each gene is then a separate hypothesis. In this case, make sure you take this into account in the statistical analysis (correction for multiple hypothesis testing).
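
To make the correction for multiple hypothesis testing concrete, here is a minimal sketch of two common adjustments, Bonferroni and Benjamini-Hochberg, written in plain Python. The gene names and p-values are invented for illustration, and which correction suits your experiment is a question for your statistical analysis plan, not something this snippet decides.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four candidate genes (made-up numbers).
raw = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.20}
names = list(raw)
bonf = dict(zip(names, bonferroni(list(raw.values()))))
bh = dict(zip(names, benjamini_hochberg(list(raw.values()))))

for name in names:
    print(f"{name}: raw={raw[name]:.3f}  Bonferroni={bonf[name]:.3f}  BH={bh[name]:.3f}")
```

With these invented numbers, three genes fall below 0.05 before correction, but only geneA stays below 0.05 after either adjustment. That is exactly the kind of difference that testing several candidates at once can hide.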

  3. What is the approach?

This is the methodology used – from specific manipulations of the system, through harvesting of material for analysis and the analysis methods, to the statistical processing of the results. Pretty straightforward, but it is always good to have a clear idea of what it will take to accomplish the goal of the experiment from start to finish. Many an experiment has failed because the hapless experimenter forgot that they needed a tool/piece of equipment/reagent that they did not have at hand.

  4. What are the experimental groups and controls?

I cannot stress this enough. Every experiment that has a hypothesis, and most that don’t have one, requires a negative control. An experiment without a negative control is as good as no experiment at all. It is the negative control that allows you to test the hypothesis, allows you to narrow the list in a “fishing expedition”, and helps you decide whether tool generation was successful. If you run out of reagents/cells/etc., it is the negative control that must be analyzed, even at the expense of some experimental groups. A good negative control is a sample that is as close to the experimental groups as possible. So if you are doing siRNA experiments in a cell line, don’t use untransfected cells as negative controls – use cells transfected the same way as the experimental group cells, but with a non-targeting siRNA instead of the specific one. Sometimes there are multiple possible negative controls – using several different ones will make your results more trustworthy. If your experiment has a well-defined hypothesis, make sure that the controls allow you to test this hypothesis – if they don’t, go back to the drawing board. Positive controls are not as important as negative ones, but they often help confirm that the experiment is working as expected and help you troubleshoot if it is not.

  5. What are the expected results?

In this section, you should describe how the results would look (more or less) if your hypothesis were true. This will help you interpret the results once you have them in hand. If your results are as expected, you are golden. If they are not – maybe there is something wrong with your experiment (look at the positive controls) or with your hypothesis.

I think that this 5-question framework will help any beginner make sure that they know what they are doing and aid them in experimental design. I think it is also a very useful way to look at figures in papers. Oftentimes beginner researchers look at a panel in the figure and don’t really know what it’s about. Once they answer these 5 questions, they are much more likely to understand where the experiment fits and how to interpret it. The answers to the questions are usually implicitly or explicitly stated in the text and figure legend, so seek and you shall find. I hope you will find this helpful – please make sure to comment if you have feedback or suggestions.
