RNA-Protein Interaction Profiles Across Six Human Cell Lines
by Yubo Wang · Updated 2mo ago
20.8 GB · 7 files
Description
A collection of 261 unified RNA-RBP interaction datasets covering 172 RNA-binding proteins (RBPs) across six human cell lines (K562, HepG2, HEK293, HEK293T, HeLa, and H9). It comprises 65 CLIP-seq experiments from POSTAR and 196 eCLIP experiments from ENCODE, providing a basis for model training. Author Yubo Wang uploaded the 20.8 GB dataset to figshare under a CC-BY-4.0 license.
Use Cases
Train machine learning models to predict RNA-protein binding based on sequence and structure motifs.
Evaluate the impact of genetic variants on RNA-protein interactions across diverse cellular contexts.
Benchmark computational methods for analyzing CLIP-seq and eCLIP data.
Study the cell-type specificity of RNA-binding protein interactions.
Strengths
Unifies 261 datasets from two major sources (POSTAR and ENCODE) into a single collection.
Profiles 172 RNA-binding proteins across six distinct human cell lines, suggesting broad cellular coverage.
All experiments were processed with a uniform flagmarked protocol, which may improve cross-experiment comparability.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row counts are not reported, so suitability for a specific modeling task is hard to assess before download.
At 20.8 GB, the download is large and may require substantial storage and compute to process.
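Because column documentation is absent, one way to start inferring field semantics is to preview the first few lines of each file inside a downloaded ZIP archive without extracting it. The sketch below is a minimal illustration using only the Python standard library. The function name `preview_zip` and the archive path in the usage note are hypothetical; the actual file names and formats inside this dataset are not documented here. It assumes the members are plain text. Compressed or binary members will print as unreadable characters, and RAR archives need a separate tool.

```python
import zipfile


def preview_zip(archive_path, max_members=5, max_lines=3):
    """Return {member_name: [first lines]} for the first few files in a ZIP.

    Hypothetical helper: reads members in place, so nothing is extracted.
    """
    previews = {}
    with zipfile.ZipFile(archive_path) as archive:
        # Skip directory entries; keep only real files.
        members = [info for info in archive.infolist() if not info.is_dir()]
        for info in members[:max_members]:
            lines = []
            with archive.open(info) as handle:
                for raw in handle:
                    # Decode leniently so unexpected bytes do not raise.
                    text = raw.decode("utf-8", errors="replace")
                    lines.append(text.rstrip("\r\n"))
                    if len(lines) >= max_lines:
                        break
            previews[info.filename] = lines
    return previews
```

A call such as `preview_zip("downloaded_archive.zip")` (placeholder path) returns a small dictionary that can be printed to check delimiters, header rows, and column counts before committing to a full extraction.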
Provenance
Source
Combined from POSTAR repository and ENCODE project.
Collection Method
CLIP-seq and eCLIP experiments processed with a uniform flagmarked protocol.
Freshness
Last updated 2026-05-02 14:26:22; freshness should be verified.
Files are packaged in RAR and ZIP formats; a SHA256 checksum is provided for verification.
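Given the size of the download, it may be worth checking each archive against the published SHA256 value before unpacking. The following is a minimal sketch using Python's standard `hashlib`. The helper names `sha256_of_file` and `matches_checksum` are illustrative, and the expected digest must be copied from the dataset's hosting page; none is reproduced here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-GB archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

If `matches_checksum` returns `False`, the file was most likely truncated or corrupted in transit and should be downloaded again before extraction.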