Updates on 2022 data entry, 2021-2022 Analyses; and preparing for Multi-species data.

Short-term goals

Before I get into the diffferent updates, here’s my short-term goals leading up to my trip to Quadra Oct 19-24 for our team writing retreat:

  • Transfer Multi-species raw RNAseq data to OWL
  • Finish analyses for 2021/2022 work and have solid results

Goals for when at Quadra:

  • Write paper based on the newest results for 2021/2022 work that I will have done before Oct 19!

2022 Data Entry

I have some things to go back and fix now that the data entry is complete and Alyssa, Melanie, and I discussed things.

First, I went through what Alyssa and Melanie entered and answered any questions they had - they were all related to not being able to discern what the dates or times were on the scanned data sheets, so I fixed them since I have the hard copies.

Then, I went through and adjusted things we discussed in our meeting yesterday:

  • Go back through and assign disease scores per our categorizations we came up with since 2022
  • leave data related to twisting arms after we removed arms…
  • go back and mark time points when stars were sampled - in 2022 we kept the logs separate, but it will be nice to have all the information in one big datasheet!

Summer 2021/2022 Alignment

Summer 2021 Matrices

Completed code through creating gene and transcript count matrices for summer 2021. Code: 20-hisat2-genecount-matrices.Rmd
Matrices:

Tomorrow: Re-run through DESeq2 to get DEG lists. Have all the code already! Will just use the newest gene count matrix linked just above.

Summer 2022 Matrices

Creating matrices for 2022 RNAseq data in the same R code - 20-hisat2-genecount-matrices.Rmd. Files are in analyses/20-hisat2/2022 directory.

pycno2022 trimmed data is in: /home/shared/8TB_HDD_02/graceac9/data/pycno2022 on Raven moving it into the data/2022_trimmed directory in this repo:

rsync --archive --progress --verbose PSC*fq.gz graceac9@raven.fish.washington.edu:/home/shared/8TB_HDD_02/graceac9/GitHub/paper-pycno-sswd-2021-2022/data/2022_trimmed

run rsync while in the /home/shared/8TB_HDD_02/graceac9/data/pycno2022 directory.

Running the alignment as of 2024-10-10 1:25pm.

Alignment complete.

31/32 libraries have alignment rates 73-80%. One library has 64% alignment rate.

Currently running conversion of sam to bam files.

Multi-species prep

I was given log-in information for accessing the raw RNAseq files… but the portal still says the data is being uploaded… but tomorrow I’m going to try logging in and transferring the files if they are fully uploaded!!

Started looking into literature for examples of comparative transcriptomics across different species.

So far just have skimmed abstracts and titles… we’ll see what’s useful.

https://onlinelibrary.wiley.com/doi/full/10.1111/j.1365-313X.2012.05005.x

https://genome.cshlp.org/content/30/7/951.full.pdf+html

https://onlinelibrary.wiley.com/doi/epdf/10.1002/jez.b.22618

https://www.nature.com/articles/nrg.2017.19

https://academic.oup.com/nar/article/49/D1/D831/5920517#google_vignette

https://link.springer.com/article/10.1186/s12864-016-3080-9

https://link.springer.com/article/10.1186/s12864-015-1551-z

https://academic.oup.com/ismej/article/8/10/2056/7582424

https://academic.oup.com/gigascience/article/6/12/gix105/4628125

https://www.pnas.org/doi/abs/10.1073/pnas.1819657116