Updates on 2022 data entry, 2021-2022 analyses, and preparing for multi-species data.
Short-term goals
Before I get into the different updates, here are my short-term goals leading up to my trip to Quadra Oct 19-24 for our team writing retreat:
- Transfer Multi-species raw RNAseq data to OWL
- Finish analyses for 2021/2022 work and have solid results
Goals for when at Quadra:
- Write paper based on the newest results for 2021/2022 work that I will have done before Oct 19!
2022 Data Entry
I have some things to go back and fix now that the data entry is complete and Alyssa, Melanie, and I discussed things.
First, I went through what Alyssa and Melanie entered and answered any questions they had - they were all related to not being able to discern what the dates or times were on the scanned data sheets, so I fixed them since I have the hard copies.
Then, I went through and adjusted things we discussed in our meeting yesterday:
- Go back through and assign disease scores per the categorizations we have come up with since 2022
- Leave data related to twisting arms after we removed arms…
- Go back and mark time points when stars were sampled. In 2022 we kept the logs separate, but it will be nice to have all the information in one big datasheet!
Summer 2021/2022 Alignment
Summer 2021 Matrices
Completed code through creating gene and transcript count matrices for summer 2021.
Code: 20-hisat2-genecount-matrices.Rmd
Matrices:
- Genes: paper-pycno-sswd-2021-2022/data/gene_count_matrix_2021.csv
- Transcripts: paper-pycno-sswd-2021-2022/data/transcript_count_matrix_2021.csv
Tomorrow:
Re-run through DESeq2 to get DEG lists. Have all the code already! Will just use the newest gene count matrix linked just above.
Summer 2022 Matrices
Creating matrices for the 2022 RNAseq data in the same R code (20-hisat2-genecount-matrices.Rmd). Files are in the analyses/20-hisat2/2022 directory.
The pycno2022 trimmed data is in /home/shared/8TB_HDD_02/graceac9/data/pycno2022 on Raven. Moving it into the data/2022_trimmed directory in this repo:
rsync --archive --progress --verbose PSC*fq.gz graceac9@raven.fish.washington.edu:/home/shared/8TB_HDD_02/graceac9/GitHub/paper-pycno-sswd-2021-2022/data/2022_trimmed
Run rsync while in the /home/shared/8TB_HDD_02/graceac9/data/pycno2022 directory.
Running the alignment as of 2024-10-10 1:25pm.
Alignment complete.
31 of 32 libraries have alignment rates of 73-80%. One library has a 64% alignment rate.
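A quick way to pick out a low-rate library like the 64% one is to pull the overall alignment rate from each HISAT2 summary. This is only a minimal sketch: it assumes each library's HISAT2 summary was saved to its own `.log` file, and the function name, log naming, and 70% cutoff are hypothetical choices for illustration rather than anything in the repo.

```shell
# Print "<library><TAB><rate>" for every log whose overall alignment rate is below a cutoff.
# Assumes each log contains HISAT2's final summary line, e.g. "73.45% overall alignment rate".
flag_low_alignment() {
  log_dir="$1"        # hypothetical directory holding one HISAT2 log per library
  cutoff="${2:-70}"   # percent; arbitrary default chosen for illustration
  for log in "$log_dir"/*.log; do
    [ -e "$log" ] || continue   # skip if the glob matched nothing
    # Keep only the number in front of the percent sign on the summary line
    rate=$(grep 'overall alignment rate' "$log" | tail -n 1 | sed 's/%.*//')
    [ -n "$rate" ] || continue  # skip logs without a summary line
    awk -v r="$rate" -v c="$cutoff" -v n="$(basename "$log" .log)" \
      'BEGIN { if (r + 0 < c + 0) printf "%s\t%s\n", n, r }'
  done
}
```

Usage would look something like `flag_low_alignment path/to/logs 70`, where `path/to/logs` stands in for wherever the alignment logs were actually written.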
Currently running conversion of sam to bam files.
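For reference, the SAM to BAM step can be sketched as a loop over the alignment output. This is a rough sketch under assumptions, not the exact code in the Rmd: it assumes samtools 1.x is on the PATH, and the function name, thread count, and `.sorted.bam` suffix are hypothetical. It also coordinate-sorts and indexes in the same pass, since many downstream tools expect sorted BAMs; whether the actual code sorts at this step isn't stated here.

```shell
# Convert every SAM file in a directory to a coordinate-sorted, indexed BAM.
# Assumes samtools (1.x) is on PATH; directory layout and thread count are illustrative.
sam_to_sorted_bam() {
  sam_dir="$1"          # directory containing the *.sam files
  threads="${2:-4}"     # additional threads passed to samtools sort
  for sam in "$sam_dir"/*.sam; do
    [ -e "$sam" ] || continue            # skip if the glob matched nothing
    bam="${sam%.sam}.sorted.bam"         # e.g. LIB.sam -> LIB.sorted.bam
    # samtools sort reads SAM directly and writes BAM based on the -o extension
    samtools sort -@ "$threads" -o "$bam" "$sam" && samtools index "$bam"
  done
}
```

Once the sorted BAMs check out, the SAM files can be deleted to free up disk space, since they are much larger than the BAMs.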
Multi-species prep
I was given log-in information for accessing the raw RNAseq files, but the portal still says the data is being uploaded. Tomorrow I’m going to try logging in and transferring the files if they are fully uploaded!
Started looking into literature for examples of comparative transcriptomics across different species.
So far I have just skimmed abstracts and titles… we’ll see what’s useful.
https://onlinelibrary.wiley.com/doi/full/10.1111/j.1365-313X.2012.05005.x
https://genome.cshlp.org/content/30/7/951.full.pdf+html
https://onlinelibrary.wiley.com/doi/epdf/10.1002/jez.b.22618
https://www.nature.com/articles/nrg.2017.19
https://academic.oup.com/nar/article/49/D1/D831/5920517#google_vignette
https://link.springer.com/article/10.1186/s12864-016-3080-9
https://link.springer.com/article/10.1186/s12864-015-1551-z
https://academic.oup.com/ismej/article/8/10/2056/7582424
https://academic.oup.com/gigascience/article/6/12/gix105/4628125
https://www.pnas.org/doi/abs/10.1073/pnas.1819657116