Context embeddings trained on short sequence FASTA files with 3D structures

Published: 24 April 2023 | Version 1 | DOI: 10.17632/jrvm4bbc5h.1
Contributor:
Daniel Um

Description

Data used to generate "FIGURE 3: Context embeddings trained on short sequence FASTA files with 3D structures – N = 17552" in "Vector Embeddings by Sequence Similarity and Context for Improved Compression, Similarity Search, Clustering, Organization, and Manipulation of cDNA Libraries" by Daniel H. Um et al.

Files

Steps to reproduce

Run Context_Embeddings_3D_Plot_Generation_Script.ipynb with 3D_Structure_1_200_seq_length_17552.fasta as input to generate 3D_plot_FASTA_files_with_3D_structures.png.
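
Before running the notebook, it may help to check that the FASTA file downloaded intact. The sketch below is not part of the dataset and does not reproduce the notebook's embedding or plotting steps. It only counts the records and reports the shortest and longest sequence. The function names read_fasta and summarize are hypothetical, and the expectation that the file holds 17552 sequences of length 1 to 200 is an assumption read off the file name and the figure title, not something the record states directly.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence data may be wrapped over several lines.
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(path):
    """Return the record count and the min/max sequence length."""
    lengths = [len(seq) for _, seq in read_fasta(path)]
    return {
        "count": len(lengths),
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
    }


if __name__ == "__main__":
    # File name taken from the steps above; adjust the path as needed.
    fasta = Path("3D_Structure_1_200_seq_length_17552.fasta")
    if fasta.exists():
        # Assumed (from the file name): count 17552, lengths within 1..200.
        print(summarize(fasta))
    else:
        print(f"{fasta} not found in the current directory")
```

If the reported count or length range differs from what the file name suggests, the download may be incomplete, or the assumption about the file name may simply be wrong.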

Institutions

Columbia University

Categories

Bioinformatics, Machine Learning

Licence