Hi all,

I'm really stuck with this one, so any help would be appreciated.

I have expression data for 28 different tissues. The matrix I create will look something like the one below, but with 28 tissues and around 50k rows:

```
tissue1 tissue2 tissue3 .....etc....
gene1
gene2
gene3
etc..
```

I'm going to use limma's `normalizeBetweenArrays()` function to quantile normalise the data. I can't figure out whether I should fill the matrix with the raw RPKM values or the log2-transformed values before passing it to the limma function. Which one should it be?

EDIT: I do get how quantile normalisation works, but I just don't know whether it is correct to use it on log2 values. I have read some resources on this, however none of them is clear about what the input should be.

Thanks,

Kenneth

Have a look at the Wikipedia article on quantile normalization:

https://en.wikipedia.org/wiki/Quantile_normalization
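
To make the raw-versus-log2 question concrete, here is a minimal sketch of the procedure described on that page, written in plain Python rather than R. It is not limma's implementation: the function name `quantile_normalize`, the toy RPKM numbers, and the pseudocount of 1 are all assumptions for illustration, and ties are handled naively. It runs the normalisation once on the raw values (taking log2 afterwards) and once on the log2 values.

```python
import math

def quantile_normalize(matrix):
    """Quantile normalise the columns of a genes x samples matrix (list of rows).

    Simplification: ties within a column are broken by row order
    rather than averaged.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(sc[k] for sc in sorted_cols) / n_cols for k in range(n_rows)]

    # Replace each value by the reference value for its within-column rank.
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result

# Made-up RPKM-like values: 4 genes (rows) x 3 tissues (columns).
rpkm = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.5, 6.0],
    [4.0, 2.0, 8.0],
]
pseudo = 1.0  # assumed pseudocount so that log2(0) cannot occur

log_rpkm = [[math.log2(v + pseudo) for v in row] for row in rpkm]

# Option A: quantile normalise the raw values, then take log2.
qn_then_log = [[math.log2(v + pseudo) for v in row]
               for row in quantile_normalize(rpkm)]

# Option B: take log2 first, then quantile normalise.
log_then_qn = quantile_normalize(log_rpkm)

for a_row, b_row in zip(qn_then_log, log_then_qn):
    print([round(a, 3) for a in a_row], [round(b, 3) for b in b_row])
```

Because log2 is monotonic, both orders keep the same ranking of genes within each tissue. The values themselves differ, because option A averages each rank on the raw scale while option B averages it on the log scale.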

For a practical example without the theory, see this recent paper:

Comparison of normalization and differential expression analyses using RNA-Seq data from 726 individual Drosophila melanogaster

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702322/

Also see this post and the articles mentioned at the bottom of it:

Normalization Of Gene Expression Using Rnaseq Rpkm Values

OP didn't ask for an explanation of quantile normalisation... sharing links can be helpful, but these don't appear to be specifically about this question. If the answer to his question is somewhere on those pages, why don't you give the answer and refer to the pages for further explanation?