See post for details.

Project Repository Link

NOTE: The repository is private. It contains unpublished data from collaborators, so the links here and in this post will only work for people who have been granted collaborator access, until the repository can be made public.

grace-ac/project-pycno-multispecies-2023

Alignments

P. helianthoides

Code: 12-hisat2_pycno.Rmd

Output:

Alignment MultiQC Report: pycno_2023_alignment_multiqc_report.html

Alignment Rates:
Exposed, Day 12

Sample ID   Alignment Rate   Species
PSC-0519    79.18%           P. helianthoides
PSC-0525    79.40%           P. helianthoides
PSC-0531    78.52%           P. helianthoides
PSC-0537    77.72%           P. helianthoides
PSC-0549    80.91%           P. helianthoides
PSC-0561    81.66%           P. helianthoides

23464 genes; 25831 transcripts.
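
The per-sample rates above presumably correspond to the "overall alignment rate" line that HISAT2 prints at the end of each alignment summary, which MultiQC then collects. As a minimal sketch, assuming the summaries were saved as text, that line could be pulled out like this. The function name and the example summary are made up for illustration and are not taken from 12-hisat2_pycno.Rmd.

```python
import re

# HISAT2 ends its alignment summary with a line like "79.18% overall alignment rate".
RATE_PATTERN = re.compile(r"([\d.]+)% overall alignment rate")


def overall_alignment_rate(summary_text):
    """Return the overall alignment rate as a float, or None if it is absent."""
    match = RATE_PATTERN.search(summary_text)
    return float(match.group(1)) if match else None


# Abbreviated, made-up summary text used only to exercise the parser.
example_summary = """\
1000 reads; of these:
  1000 (100.00%) were paired; of these:
79.18% overall alignment rate
"""

print(overall_alignment_rate(example_summary))
```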

P. ochraceus

Code: 13-hisat2_pisaster.Rmd

Output:

Alignment MultiQC Report: pisaster_2023_alignment_multiqc_report.html

Alignment Rates:
Exposed, Day 12

Sample ID   Alignment Rate   Species
PSC-0518    86.49%           P. ochraceus
PSC-0524    85.01%           P. ochraceus
PSC-0530    85.92%           P. ochraceus
PSC-0536    85.03%           P. ochraceus
PSC-0548    86.61%           P. ochraceus
PSC-0560    83.84%           P. ochraceus

32370 genes; 35696 transcripts.

D. imbricata

Code: 14-hisat2_derm.Rmd

Output:

Alignment MultiQC Report: dermasterias_2023_alignment_multiqc_report.html

Alignment Rates:
Exposed, Day 12

Sample ID   Alignment Rate   Species
PSC-0517    87.81%           D. imbricata
PSC-0523    85.42%           D. imbricata
PSC-0529    83.73%           D. imbricata
PSC-0535    88.16%           D. imbricata
PSC-0547    86.96%           D. imbricata
PSC-0559    87.01%           D. imbricata

22124 genes; 25030 transcripts.

Annotations

Each species has a protein annotation of its genome, and all three files share the same name, “augustus.hints.aa”. I ran blastp on each of these files to get UniProt/Swiss-Prot annotations for the genes/transcripts, which can then be joined with the count matrices for annotation.
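
For reference, a blastp run of this kind can be sketched as below. This is only an illustration under assumptions: it presumes a Swiss-Prot protein database has already been built with makeblastdb, and the database prefix, output file name, e-value cutoff, and thread count are hypothetical placeholders rather than the settings actually used. The sketch only builds and prints the command; it does not run BLAST.

```python
import shlex


def build_blastp_command(query_fasta, db_prefix, out_path, evalue=1e-20, threads=4):
    """Return the blastp argument list for a tabular (outfmt 6) search."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db_prefix,
        "-out", out_path,
        "-outfmt", "6",            # 12-column tab-delimited output
        "-evalue", str(evalue),    # assumed cutoff, not from the original analysis
        "-max_target_seqs", "1",   # assumed: limit the hits reported per query
        "-num_threads", str(threads),
    ]


cmd = build_blastp_command(
    "augustus.hints.aa",
    "blastdb/uniprot_sprot",   # hypothetical database prefix
    "blastp_out.tab",          # hypothetical output file name
)
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to subprocess.run on a machine where BLAST+ is installed.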

P. helianthoides

Code:

Output:

Ran blastp on the augustus.hints.aa file from https://datadryad.org/dataset/doi:10.5061/dryad.51c59zwfd.

I then submitted the UniProt accession IDs from the BLAST output to the ID Mapping tool at uniprot.org and downloaded the resulting large table of GO annotations.

That table was joined with the Phel_aughintcod_blastout_sep.tab.
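
Getting accession IDs out of the BLAST output can be sketched as follows. This assumes the subject IDs use the Swiss-Prot FASTA header form sp|ACCESSION|ENTRY_NAME and that the output is the 12-column tabular format. The function name and the example rows are made up for illustration.

```python
import csv
import io


def extract_accessions(blast_tab_text):
    """Map each query ID to the UniProt accession of its first listed hit."""
    hits = {}
    for row in csv.reader(io.StringIO(blast_tab_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        query_id, subject_id = row[0], row[1]
        parts = subject_id.split("|")
        # "sp|P12345|EXMP_HUMAN" -> "P12345"; any other form is kept unchanged.
        accession = parts[1] if len(parts) == 3 else subject_id
        hits.setdefault(query_id, accession)  # keep only the first hit per query
    return hits


# Made-up rows in blastp tabular (outfmt 6) layout.
example = (
    "g1.t1\tsp|P12345|EXMP_HUMAN\t45.2\t200\t100\t3\t1\t200\t5\t204\t1e-50\t180\n"
    "g1.t1\tsp|Q99999|OTHR_MOUSE\t40.0\t190\t110\t4\t1\t190\t8\t197\t1e-30\t120\n"
    "g2.t1\tsp|O00001|TEST_BOVIN\t60.1\t150\t60\t0\t1\t150\t1\t150\t1e-70\t250\n"
)

hits = extract_accessions(example)
# Unique accessions, one per line, ready to paste into the ID Mapping form.
print("\n".join(sorted(set(hits.values()))))
```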

Then I joined the P. helianthoides transcript count matrix with the full annotation table, resulting in this annotated transcript count matrix:

output/18-annot-pycno/Phel_transcript_counts_annotation.tab
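
The final join is a left join that keeps every transcript in the count matrix, whether or not it has an annotation. A minimal sketch of that idea, using only the standard library, is below. The column names (transcript_id, uniprot_accession, go_ids), the counts, and the NA fill value are assumptions for illustration; the real tables were joined in the project code and may use different keys.

```python
import csv
import io


def left_join(counts_text, annot_text, key):
    """Left-join tab-delimited annotation rows onto count rows by a shared key."""
    annot_reader = csv.DictReader(io.StringIO(annot_text), delimiter="\t")
    annot_cols = [col for col in annot_reader.fieldnames if col != key]
    # If a key appears more than once in the annotation, the last row wins.
    annot_by_key = {row[key]: row for row in annot_reader}

    joined = []
    for row in csv.DictReader(io.StringIO(counts_text), delimiter="\t"):
        match = annot_by_key.get(row[key], {})
        for col in annot_cols:
            row[col] = match.get(col, "NA")  # unannotated transcripts get NA
        joined.append(row)
    return joined


# Made-up counts and annotations; only the sample IDs come from the tables above.
counts = "transcript_id\tPSC-0519\tPSC-0525\ng1.t1\t10\t12\ng2.t1\t0\t3\n"
annot = "transcript_id\tuniprot_accession\tgo_ids\ng1.t1\tP12345\tGO:0005515\n"

joined = left_join(counts, annot, key="transcript_id")
for row in joined:
    print("\t".join(row.values()))
```

A left join is used so that transcripts without a BLAST hit stay in the matrix and remain available for downstream count-based analysis.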

P. ochraceus

Code:

Output:

Ran blastp on the augustus.hints.aa file from Mike Dawson and Lauren Schiebelhut (not publicly available yet).

I then submitted the UniProt accession IDs from the BLAST output to the ID Mapping tool at uniprot.org and downloaded the resulting large table of GO annotations.

That table was joined with the Pisaster_aughintcod_blastout_sep.tab.

Then I joined the P. ochraceus transcript count matrix with the full annotation table, resulting in this annotated transcript count matrix:

output/19-annot-pisaster/Pisaster_transcript_counts_annotation.tab

D. imbricata

Code:

Output:

Ran blastp on the augustus.hints.aa file from Mike Dawson and Lauren Schiebelhut (not publicly available yet).

I then submitted the UniProt accession IDs from the BLAST output to the ID Mapping tool at uniprot.org and downloaded the resulting large table of GO annotations.

That table was joined with the Dermasterias_aughintcod_blastout_sep.tab.

Then I joined the D. imbricata transcript count matrix with the full annotation table, resulting in this annotated transcript count matrix:

output/20-annot-derm/Derm_transcript_counts_annotation.tab