A collection of 250 Hollywood movie scripts compiled by Pratik Kalamkar. The dataset is described as balanced and includes age rating information. The original source of the scripts, the time period they cover, and details of the script content are not provided.
Use Cases
- Analyze linguistic patterns and genre characteristics based on movie script text.
- Train models to classify or predict age ratings based on script content.
- Study the relationship between script dialogue and assigned audience ratings.
- Perform comparative analysis across different age rating categories using the balanced sample.
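As a starting point for the comparative use case, the sketch below computes the average script length per rating category. It assumes each record can be reduced to a (rating, script text) pair; the actual field names are undocumented, and the function name `mean_tokens_by_rating` and the sample rows are hypothetical illustrations, not dataset content.

```python
from collections import defaultdict


def mean_tokens_by_rating(rows):
    """Return {rating: mean whitespace-token count} for (rating, text) pairs.

    The (rating, text) shape is an assumption; adapt it to the real columns.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for rating, text in rows:
        totals[rating] += len(text.split())  # crude whitespace tokenization
        counts[rating] += 1
    return {rating: totals[rating] / counts[rating] for rating in totals}


if __name__ == "__main__":
    # Invented toy rows, only to show the call shape.
    sample = [
        ("PG", "Hello there friend"),
        ("PG", "Run"),
        ("R", "Get out of here now"),
    ]
    print(mean_tokens_by_rating(sample))
```

Whitespace splitting is deliberately simple; a real analysis would likely separate dialogue from stage directions before counting.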
Strengths
- Contains 250 movie scripts, enough for exploratory analysis and small-scale modeling.
- Includes age rating labels, enabling supervised learning tasks.
- Described as 'balanced', which may indicate an even distribution across rating categories; this is not confirmed.
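Because the 'balanced' claim is unverified, it may be worth checking once the labels are loaded. The following is a minimal sketch that assumes the ratings are available as a flat list of strings; `rating_balance` is a hypothetical helper and the example labels are invented.

```python
from collections import Counter


def rating_balance(labels):
    """Return (counts, ratio), where ratio = largest class / smallest class.

    A ratio of 1.0 means every observed rating category has the same count.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels supplied")
    ratio = max(counts.values()) / min(counts.values())
    return dict(counts), ratio


if __name__ == "__main__":
    # Invented labels, not taken from the dataset.
    counts, ratio = rating_balance(["PG", "R", "PG", "R", "PG-13", "PG-13"])
    print(counts, ratio)
```

Note that this only covers categories that actually appear; a rating with zero scripts would not show up in the counts.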
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown beyond the title's mention of 250; other dataset dimensions are unspecified.
- Last update date is unknown; freshness unverified.
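Since column semantics and dimensions have to be inferred after download, a quick inspection pass can help. This sketch assumes the data ships as a CSV file with a header row, which the card does not confirm; `summarize_columns` is a hypothetical helper, and the file name in the usage note is a placeholder.

```python
import csv
import io

# Full scripts can exceed the csv module's default per-field size limit,
# so raise it before reading (the exact value is an arbitrary large bound).
csv.field_size_limit(10**9)


def summarize_columns(fileobj, preview_chars=40):
    """Return (row_count, {column: {"non_empty": n, "example": str}})."""
    reader = csv.DictReader(fileobj)
    rows = list(reader)
    summary = {}
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows if row[name]]
        summary[name] = {
            "non_empty": len(values),
            "example": values[0][:preview_chars] if values else "",
        }
    return len(rows), summary


if __name__ == "__main__":
    # Invented in-memory CSV standing in for the real file.
    demo = io.StringIO("title,rating\nExample Film,PG\nAnother Film,\n")
    print(summarize_columns(demo))
```

For a real file, something like `with open("scripts.csv", newline="", encoding="utf-8") as f: summarize_columns(f)` would apply, with the path replaced by whatever the download actually contains.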
Provenance
- Source: Kaggle user Pratik Kalamkar