IMDB-Data-Analysis-in-SQL

This project answers a set of analytical questions in order to advise a movie production house on which actors, directors, and production houses would be the best fit for a super-hit commercial movie.


Table of Content (TOC)

  • Database Creation for the Project
  • Table Creation
  • Data Insertion

Data Analysis

  • EXECUTIVE SUMMARY AND RECOMMENDATIONS

1. Overview

This analysis is carried out to support RSVP Movies with a well-analyzed list of global stars to plan a movie for the global audience in 2022.

The goal is to answer a set of analytical questions and advise RSVP Movies on which actors, directors, and production houses would be the best fit for a super-hit commercial movie.

IMDB Data Analysis in MySQL

RSVP Movies is an Indian film production company that has produced many super-hit movies. They have usually released movies for the Indian audience but for their next project, they are planning to release a movie for the global audience in 2022.

Why this Analysis?

The production company wants to plan every move analytically, based on data, and has approached us for help with this new project.

We have been provided with the data of the movies that have been released in the past three years. Let’s analyze the data set and draw meaningful insights that can help them start their new project.

We will use SQL to analyze the given data and give recommendations to RSVP Movies based on the insights.

We will carry out the analysis in four segments, where each segment leads to significant insights from different combinations of tables.

2. Database Creation for the Project

a. Check the list of databases

  • The very first step of any MySQL analysis is to access the database and check if related data is available or not.
  • Use show databases; to access the list of databases:
Database
classicmodels
company
information_schema
market_star_schema
org

b. Create Database

  • Create a new database for this project.
  • Use Create database IMDB;
  • Use show databases; to confirm the list of databases:
Database
classicmodels
company
imdb
information_schema
market_star_schema
org

c. Use Database

  • Instruct the system to use *IMDB Database* by running use imdb;

3. Table Creation

Steps to follow before creating the tables:

  • Download the IMDb dataset and try to understand every table and its importance.
  • Understand the ERD and the table details. Study them carefully and understand the relationships between the tables.


  • Inspect each table given in the subsequent tabs and understand the features associated with each of them.
  • Draft your tables with the correct data types and constraints on paper or in a notes file.
  • Open your MySQL Workbench and start writing the DDL and DML commands to create the database.

Create Table

For this project we need a total of 6 tables:

Table Number Tables_in_imdb
1 director_mapping
2 genre
3 movie
4 names
5 ratings
6 role_mapping

a. Create Table Movie

Table Name: Movie Column Description
id Movie Id is a unique ID associated with each movie
title Title of the movie
year year of Release
date_published Date of Movie Release
duration Duration of Movie
country Country of Release
worlwide_gross_income Worldwide gross income of the movie
languages Languages released in
production_company production company associated with the movie

b. Create Table Genre

Table Name: Genre Column Description
movie_id Movie Id of the movie
genre Genre tagged for movie

c. Create Table director_mapping

Table Name: director_mapping Column Description
movie_id Movie Id of the movie directed by a director
name_id Name ID of the director

d. Create Table role_mapping

Table Name: Role_Mapping Column Description
movie_id Movie Id of the movies
name_id Name ID of the associated person
category Associated responsibility like Actor, director on a movie

e. Create Table names

Table Name: Names Column Description
id Name ID of each individual
name Name of each individual
height Height of individual
date_of_birth DOB
known_for_movies Famous or well known movie

f. Create Table ratings

Table Name: Ratings Column Description
movie_id Movie Id of the movie
avg_rating Average Rating of Movie
total_votes Total vote counts
median_rating Median Rating of the movie

Now run show tables; to confirm that all six tables have been created.
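The column lists above can be turned into DDL. The following is a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL Workbench so that it runs anywhere. The data types, keys, and constraints are assumptions inferred from the column descriptions, not the project's actual schema.

```python
import sqlite3

# Assumed types and constraints; the project's real MySQL DDL may differ.
SCHEMA = """
CREATE TABLE movie (
    id                    TEXT PRIMARY KEY,
    title                 TEXT NOT NULL,
    year                  INTEGER,
    date_published        TEXT,
    duration              INTEGER,
    country               TEXT,
    worlwide_gross_income TEXT,   -- column name spelled as listed above
    languages             TEXT,
    production_company    TEXT
);
CREATE TABLE genre (
    movie_id TEXT NOT NULL REFERENCES movie(id),
    genre    TEXT NOT NULL,
    PRIMARY KEY (movie_id, genre)
);
CREATE TABLE names (
    id               TEXT PRIMARY KEY,
    name             TEXT,
    height           INTEGER,
    date_of_birth    TEXT,
    known_for_movies TEXT
);
CREATE TABLE director_mapping (
    movie_id TEXT NOT NULL REFERENCES movie(id),
    name_id  TEXT NOT NULL REFERENCES names(id),
    PRIMARY KEY (movie_id, name_id)
);
CREATE TABLE role_mapping (
    movie_id TEXT NOT NULL REFERENCES movie(id),
    name_id  TEXT NOT NULL REFERENCES names(id),
    category TEXT,
    PRIMARY KEY (movie_id, name_id)
);
CREATE TABLE ratings (
    movie_id      TEXT PRIMARY KEY REFERENCES movie(id),
    avg_rating    REAL,
    total_votes   INTEGER,
    median_rating INTEGER
);
"""


def create_schema(conn):
    """Create the six tables in an empty database."""
    conn.executescript(SCHEMA)


def list_tables(conn):
    """SQLite's rough equivalent of MySQL's SHOW TABLES."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
print(list_tables(conn))
```

In MySQL the same statements would typically use VARCHAR, DATE, and DECIMAL types where this sketch uses TEXT and REAL.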

4. Data Insertion

In the previous steps, we created six tables. Now we will insert data into these tables. Here we show the syntax for inserting five rows into each table. (The complete data insertion script is available in the repository.)

a. Inserting data into Movie Table

b. Inserting data into Genre Table

c. Inserting data into Director_Mapping Table

d. Inserting data into Role_Mapping Table

e. Inserting data into Names Table

f. Inserting data into Ratings Table

Checking tables for inserted values:

Select * from Movie;

Select * from Genre;

Select * from Director_Mapping;

Select * from Role_Mapping;

Select * from Names;

Select * from Ratings;

All the sample data inserted looks good, so we can go ahead with inserting the complete data. For the insertion to work smoothly, let's first remove all rows from the tables using TRUNCATE:

Insert Complete data

Run the command to insert complete data: IMDB File 3 Insert all data

1. Find the total number of rows in each table of the schema?

Alternative 1:

Number of Rows after ignoring the Null Rows

Alternative 2:

Rows count inclusive of Null Rows:

TABLE_NAME Tables_in_imdb
director_mapping 3867
genre 14662
movie 8519
names 23714
ratings 8230
role_mapping 15173
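The original queries for the two alternatives are not reproduced here. As a minimal sketch of the exact-count approach, the snippet below runs COUNT(*) against each table through Python's sqlite3 module. The toy single-column tables are stand-ins, not the project's data. Note that counts read from MySQL's information_schema metadata can be approximate for InnoDB tables, whereas COUNT(*) is exact.

```python
import sqlite3

TABLES = ["director_mapping", "genre", "movie", "names", "ratings", "role_mapping"]


def row_counts(conn, tables=TABLES):
    """Return {table_name: exact row count} using COUNT(*)."""
    # Table names come from the fixed list above, so formatting them into
    # the SQL text is safe here; never do this with untrusted input.
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in tables
    }


conn = sqlite3.connect(":memory:")
for table in TABLES:
    conn.execute(f"CREATE TABLE {table} (id TEXT)")  # toy stand-in tables
conn.executemany("INSERT INTO genre VALUES (?)", [("tt1",), ("tt1",), ("tt2",)])

print(row_counts(conn))
```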

2. Which columns in the movie table have null values?

id_null title_null year_null date_null duration_null country_null world_null language_null production_null
0 0 0 0 0 20 3724 194 528
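One common way to produce a result shaped like the one above is a SUM(CASE WHEN ... IS NULL ...) expression per column. The sketch below assumes that approach (the original query is not shown in this text) and uses a three-column toy table rather than the real movie table.

```python
import sqlite3


def null_counts(conn, table, columns):
    """Count NULLs per column with one SUM(CASE ...) expression each."""
    # Table and column names are supplied by the caller from a trusted list.
    exprs = ", ".join(
        f"SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END) AS {col}_null"
        for col in columns
    )
    row = conn.execute(f"SELECT {exprs} FROM {table}").fetchone()
    return dict(zip(columns, row))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (id TEXT, title TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?)",
    [("tt1", "A", "India"), ("tt2", "B", None), ("tt3", "C", None)],  # toy rows
)

print(null_counts(conn, "movie", ["id", "title", "country"]))
```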

3.1. Find the total number of movies released each year?

Movies per year:

3.2. Find the total number of movies released each month?

Movies per month:

4.1. Find the count of Indian movies.

4.2. Find the count of movies from USA.

4.3. Find the count of movies which are either from India or USA.

4.4. Find the count of movies that are either from India or USA and released in 2019.

5. Find the unique list of the genres present in the data set.

6.1. Find the movies count for each genre.

6.2. Find the genre with the maximum number of movies.

6.3. Find the genre with the minimum number of movies.

6.4. Find the top-3 genres with the maximum number of movies.

6.4. Find the movies count for the Action genre.

6.5. Find the genre count for each movie.

6.6. Find the list of Indian movies that belong to 3 genres.

6.7. Longest Indian movie tagged with 3 genres.

‘tt6200656’, ‘Kammara Sambhavam’, ‘182’, ‘3’

6.8 Which genres are tagged with ‘Kammara Sambhavam’ movie.

genre
Action
Comedy
Drama

7.1. How many movies belong to only one genre?

Create a list of Movies with a genre count
Restrict the list to Genre count = 1
Count the total number of rows
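The three steps above can be sketched as a single query with a common table expression (CTE). This is an illustrative version run through Python's sqlite3 module on toy data; the CTE name genre_count and the helper function are assumptions, not the project's code.

```python
import sqlite3

QUERY = """
WITH genre_count AS (            -- step 1: genre count per movie
    SELECT movie_id, COUNT(genre) AS n_genres
    FROM genre
    GROUP BY movie_id
)
SELECT COUNT(*)                  -- step 3: count the remaining rows
FROM genre_count
WHERE n_genres = ?               -- step 2: restrict to the wanted count
"""


def movies_with_n_genres(conn, n):
    """Return how many movies are tagged with exactly n genres."""
    return conn.execute(QUERY, (n,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genre (movie_id TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO genre VALUES (?, ?)",
    [
        ("tt1", "Drama"),
        ("tt2", "Drama"), ("tt2", "Comedy"),
        ("tt3", "Action"),
        ("tt4", "Action"), ("tt4", "Comedy"), ("tt4", "Drama"),
    ],  # toy rows
)

print(movies_with_n_genres(conn, 1))
```

Passing 2 or 3 instead of 1 gives the counts for movies with two or three genres.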

7.2. How many movies belong to two genres?

7.3. How many movies belong to three genres?

8.1. What is the average duration of movies in each genre?

8.2. Rank the genres by the average duration of movies in each genre.

9. What is the rank of the ‘Thriller’ genre of movies among all the genres in terms of the number of movies produced?

10. Find the minimum and maximum values in each column of the ratings table except the movie_id column.

11. Which are the top 10 movies based on average rating?

12. Summarize the ratings table based on the movie counts by median ratings.

13. Which production house has produced the most number of hit movies (average rating > 8)?

Create a list of production houses with their count of movies where average rating > 8, ranked by movie count
Apply a CTE to pull the production house with rank = 1
NOTE: the filter (production_company IS NOT NULL) is applied because a few movies have no production house name
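The approach described above can be sketched as follows, using RANK() over the hit count inside a CTE. The query is an assumption about how the steps fit together, the studio names and ratings are toy values, and window functions require SQLite 3.25 or later (MySQL 8.0 or later for the original setting).

```python
import sqlite3

QUERY = """
WITH hit_counts AS (
    SELECT m.production_company, COUNT(*) AS movie_count
    FROM movie AS m
    JOIN ratings AS r ON r.movie_id = m.id
    WHERE r.avg_rating > 8
      AND m.production_company IS NOT NULL   -- skip movies with no producer
    GROUP BY m.production_company
),
ranked AS (
    SELECT production_company, movie_count,
           RANK() OVER (ORDER BY movie_count DESC) AS prod_rank
    FROM hit_counts
)
SELECT production_company, movie_count
FROM ranked
WHERE prod_rank = 1
ORDER BY production_company
"""


def top_hit_producers(conn):
    """Return the production house(s) tied for the most hits."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (id TEXT PRIMARY KEY, production_company TEXT);
CREATE TABLE ratings (movie_id TEXT PRIMARY KEY, avg_rating REAL);
""")
conn.executemany(
    "INSERT INTO movie VALUES (?, ?)",
    [("tt1", "Studio X"), ("tt2", "Studio X"), ("tt3", "Studio Y"),
     ("tt4", None), ("tt5", "Studio Y")],  # hypothetical studios
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?)",
    [("tt1", 8.5), ("tt2", 9.0), ("tt3", 8.2), ("tt4", 9.5), ("tt5", 7.0)],
)

print(top_hit_producers(conn))
```

RANK() keeps ties, so more than one production house can come back with rank 1.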

14. How many movies released in each genre during March 2017 in the USA had more than 1,000 votes?

15. Find movies of each genre that start with the word ‘The’ and which have an average rating > 8.

16. Of the movies released between 1 April 2018 and 1 April 2019, how many were given a median rating of 8?

17. Do German movies get more votes than Italian movies?

18. Which columns in the names table have null values?

19. Who are the top three directors in the top three genres whose movies have an average rating > 8?

Pull the Top three Genre by Movie count where avg_rating > 8

Pull the Directors with Movie count where avg_rating > 8

Keeping “top_3_genres” as CTE, restrict the 2nd code to avg_rating > 8 and directors of top_3_genre

Trying Row_Number() function:
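The steps for question 19 can be combined roughly as below. This is a sketch under assumptions: the CTE names, the alphabetical tie-breaks, and the use of COUNT(DISTINCT ...) to avoid counting a movie twice when it carries two top genres are illustrative choices, and the people and movies are toy data. ROW_NUMBER() requires SQLite 3.25 or later.

```python
import sqlite3

QUERY = """
WITH top_3_genres AS (
    SELECT g.genre
    FROM genre AS g
    JOIN ratings AS r ON r.movie_id = g.movie_id
    WHERE r.avg_rating > 8
    GROUP BY g.genre
    ORDER BY COUNT(*) DESC, g.genre
    LIMIT 3
),
director_hits AS (
    SELECT n.name AS director_name,
           COUNT(DISTINCT d.movie_id) AS movie_count
    FROM director_mapping AS d
    JOIN names   AS n ON n.id = d.name_id
    JOIN ratings AS r ON r.movie_id = d.movie_id
    JOIN genre   AS g ON g.movie_id = d.movie_id
    WHERE r.avg_rating > 8
      AND g.genre IN (SELECT genre FROM top_3_genres)
    GROUP BY n.id, n.name
),
ranked AS (
    SELECT director_name, movie_count,
           ROW_NUMBER() OVER (ORDER BY movie_count DESC, director_name) AS rn
    FROM director_hits
)
SELECT director_name, movie_count FROM ranked WHERE rn <= 3 ORDER BY rn
"""


def top_directors(conn):
    """Return up to three (director, hit count) rows."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genre (movie_id TEXT, genre TEXT);
CREATE TABLE ratings (movie_id TEXT, avg_rating REAL);
CREATE TABLE director_mapping (movie_id TEXT, name_id TEXT);
CREATE TABLE names (id TEXT, name TEXT);
""")
# Toy data: every name, movie id, and rating below is hypothetical.
conn.executemany("INSERT INTO ratings VALUES (?, ?)", [
    ("m1", 8.5), ("m2", 8.7), ("m3", 9.0), ("m4", 8.1), ("m5", 7.0), ("m6", 8.3)])
conn.executemany("INSERT INTO genre VALUES (?, ?)", [
    ("m1", "Drama"), ("m2", "Drama"), ("m2", "Action"), ("m3", "Action"),
    ("m4", "Comedy"), ("m5", "Horror"), ("m6", "Sci-Fi"), ("m6", "Drama")])
conn.executemany("INSERT INTO names VALUES (?, ?)", [
    ("d1", "Ann"), ("d2", "Bob"), ("d3", "Cy"), ("d4", "Dee"), ("d5", "Eve")])
conn.executemany("INSERT INTO director_mapping VALUES (?, ?)", [
    ("m1", "d1"), ("m2", "d1"), ("m3", "d2"), ("m4", "d3"),
    ("m5", "d4"), ("m6", "d5")])

print(top_directors(conn))
```

ROW_NUMBER() always returns exactly three rows when enough directors qualify; RANK() or DENSE_RANK() would instead keep ties.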

20. Who are the top two actors whose movies have a median rating >= 8?

21. Which are the top three production houses based on the number of votes received by their movies?

22. Rank actors with movies released in India based on their average ratings. Which actor is at the top of the list?

– Note: The actor should have acted in at least five Indian movies.

Alternative 1 (using RANK window function):

Alternative 2 (using CTE):

23. Find out the top five actresses in Hindi movies released in India based on their average ratings.

– Note: The actresses should have acted in at least three Indian movies.

24. Select thriller movies as per avg rating and classify them in the following category:

Rating > 8: Superhit movies
Rating between 7 and 8: Hit movies
Rating between 5 and 7: One-time-watch movies
Rating < 5: Flop movies
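The classification above maps naturally onto a CASE expression. In this sketch the boundary handling is an assumption, because the listed ranges overlap at 7: a rating of exactly 8 or 7 is treated as a hit and exactly 5 as a one-time watch. The movies are toy rows.

```python
import sqlite3

QUERY = """
SELECT m.title,
       r.avg_rating,
       CASE
           WHEN r.avg_rating > 8  THEN 'Superhit movies'
           WHEN r.avg_rating >= 7 THEN 'Hit movies'
           WHEN r.avg_rating >= 5 THEN 'One-time-watch movies'
           ELSE 'Flop movies'
       END AS category
FROM movie AS m
JOIN genre   AS g ON g.movie_id = m.id
JOIN ratings AS r ON r.movie_id = m.id
WHERE g.genre = 'Thriller'
ORDER BY r.avg_rating DESC
"""


def classify_thrillers(conn):
    """Return (title, avg_rating, category) for every thriller."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE genre (movie_id TEXT, genre TEXT);
CREATE TABLE ratings (movie_id TEXT PRIMARY KEY, avg_rating REAL);
""")
# Hypothetical titles and ratings, for illustration only.
conn.executemany("INSERT INTO movie VALUES (?, ?)", [
    ("t1", "T1"), ("t2", "T2"), ("t3", "T3"), ("t4", "T4"), ("t5", "T5")])
conn.executemany("INSERT INTO genre VALUES (?, ?)", [
    ("t1", "Thriller"), ("t2", "Thriller"), ("t3", "Thriller"),
    ("t4", "Thriller"), ("t5", "Drama")])
conn.executemany("INSERT INTO ratings VALUES (?, ?)", [
    ("t1", 8.4), ("t2", 7.0), ("t3", 5.5), ("t4", 3.2), ("t5", 9.0)])

for row in classify_thrillers(conn):
    print(row)
```

Because CASE branches are evaluated in order, each rating lands in exactly one category even though the written ranges share endpoints.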


EXECUTIVE SUMMARY AND RECOMMENDATIONS

1. Insights

Based on 7,997 movies released and recorded on IMDb between 2017 and 2019, a summary of audience interest and the resulting recommendations is given below:

  • Average duration: 103.89359 minutes
  • Total number of actors: 12611 (7445 actors and 5166 actresses)

1. Year and Month wise Movie Release Pattern:

  • The year-wise record shows a marked decline in the number of movies, from 3052 in 2017 to 2001 in 2019 (roughly a one-third drop).
  • The most movies were released in March, followed by September, October, and January. The fewest were released in the mid-year and year-end months, possibly because more people prefer vacations and family time in those periods.

2. Geographical Region Distribution

  • The USA and India together produced 1059 movies in 2019 alone, just over half of all movies released that year (2001).

3. Genre Popularity

  • Movies were tagged with the following genres: Drama, Fantasy, Thriller, Comedy, Horror, Family, Romance, Adventure, Action, Sci-Fi, Crime, and Mystery.
  • Drama is the most popular genre, with 4285 tags across the three years, followed by Comedy and Thriller.
  • There were 3289 movies with only one genre tag, while the rest were tagged with multiple genres.

4. The average duration of movies is around 104 minutes (103.89359), and the genre-wise averages stay close to the same figure.

5. Top Production Houses

  • Marvel Studios leads the production houses by number of votes received by their movies, with 551245 votes, followed by Syncopy and New Line Cinema.
  • Star Cinema and Twentieth Century Fox are the top two multilingual production houses by number of superhit movies.

6. Top Director

  • James Mangold has directed the most superhit movies, followed by Soubin Shahir, Joe Russo, and Anthony Russo.
  • A.L. Vijay, Andrew Jones, and Chris Stokes are the top directors by number of movies.

7. Top Actors and Actresses

  • Mammootty, with 8 superhit movies, is the most successful actor, followed by Mohanlal with 5 superhits.
  • Quite a few actors have 4 superhit movies to their name, including Amrinder Gill, Amit Sadh, Johnny Yong Bosch, Tovino Thomas, Dulquer Salmaan, Siddique, Rajkummar Rao, Fahadh Faasil, Pankaj Tripathi, Dileesh Pothan, Joju George, and Ayushmann Khurrana.
  • Vijay Sethupathi, Fahadh Faasil, and Yogi Babu are the top three Indian actors who have acted in at least five movies.
  • Taapsee Pannu, Divya Dutta, and Kriti Kharbanda are the top three actresses in Hindi movies who have acted in at least three movies.
  • Parvathy Thiruvothu, Susan Brown, and Amanda Lawrence are the best-rated actresses in the Drama genre.

8. The top 10 movies by average rating are: Kirket, Love in Kilnerry, Gini Helida Kathe, Runam, Fan, Android Kunjappan Version 5.25, Yeh Suhaagraat Impossible, Safe, The Brighton Miracle, and Shibu.

  • Based on median rating counts, most movies are rated between 5 and 8 and fall under the hit movie category.

9. Top Grossing Movies

The highest-grossing movies of each year are:

i. Thank You for Your Service, a comedy movie released in 2017

ii. The Villain, a thriller movie released in 2018

iii. Joker, a drama movie released in 2019

2. Recommendation:

Based on these insights, the recommendations for RSVP are as follows:

  • Concentrate on multi-genre drama-comedy movies with a pinch of thriller, keeping an average duration of around 104 minutes.
  • Plan the release between January and March. Focus on multilingual movies that can be launched in India and the USA as the preferred audience markets.
  • Rope in either Star Cinema or Twentieth Century Fox as the production house, with James Mangold directing, assisted by A.L. Vijay.
  • Mammootty and Mohanlal can be the lead actors, supported by other actors. Including Vijay Sethupathi would add star power to the movie's promotion.
  • Parvathy Thiruvothu, one of the best-rated drama actresses, should be brought in.

Use SQL on a Movie Database to Decide What to Watch


Table of Contents

Completing the SQL Movie Database Download

SQL Exercises on a Movie Database

Finding All the Movies for a Given Director

Using SQL on a Large Existing Movie Database

We’ll demonstrate how to use SQL to parse large datasets and gain valuable insights, in this case, to help you choose what movie to watch next using an IMDb dataset.

In this article, we’ll be downloading a dataset directory from IMDb. Not sure what to watch tonight? Are you browsing Netflix endlessly? Decide what to watch using the power of SQL! We’ll be loading an existing movie IMDb dataset into SQL. We’ll analyze the data in different ways like sorting movies by their rating, by what actors star in the movie, or by other similar criteria.

As mentioned in this blog post on how to practice SQL, the best way to practice SQL is by gaining hands-on experience in solving real-world problems, which is exactly what we’ll be doing.

If you have a basic knowledge of SQL, you should be able to follow this article easily. If you have no IT experience whatsoever, consider starting with this SQL A to Z Learning Track designed for people who have no experience in IT and want to start their adventure with SQL.

Let’s get started by learning how to get the movie data into our SQL database.

Let’s walk through the process of downloading our data and loading it into a database management system (DBMS), step by step. Common DBMSs include MySQL, Oracle DB, PostgreSQL, and SQL Server.

Although this article focuses on movie data, you can choose an entirely different dataset. Check out this list of free online datasets you can use and find the one you are interested in. The import of these datasets will be similar regardless of what dataset you use.

Open whatever variety of SQL you are using. For this example, I’ll be using SQL Server Management Studio, but the steps should be similar for all of the other varieties of SQL out there. Let’s get started:

  • The dataset files can be accessed and downloaded from https://datasets.imdbws.com/ . The data is refreshed daily.
  • basics.tsv.gz
  • akas.tsv.gz
  • crew.tsv.gz
  • episode.tsv.gz
  • principals.tsv.gz
  • ratings.tsv.gz
  • Extract the downloaded gzip (.gz) archives. The end result will be a TSV (tab-separated values) file for each table.
  • Open each file in a spreadsheet application like Google Sheets or Microsoft Excel.
  • Find and replace all occurrences of “\N” with an empty cell.
  • Save the file as a CSV file. This will make it easier to import into the DBMS of your choice.
  • Open your DBMS.
  • Create a new schema or table by right-clicking on the left pane and selecting “New Database.” I’ve named my new database “imdb.”


  • Set valid data types for each column you are importing. I recommend using nvarchar(MAX) for string columns, since you do not know how long the strings will be for each field. You can change the column datatype later if required.


  • Repeat this process for each of the files you have downloaded.

After completing these steps, your SQL movie database will be in place! You are now ready to start analyzing and querying the data.
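As an alternative to the spreadsheet round trip, the import can be scripted. The sketch below is one possible approach, not the article's method: it reads a gzipped TSV with Python's csv module, stores every column as text (in the spirit of the nvarchar(MAX) advice above), and turns the \N marker into NULL. It loads into SQLite rather than SQL Server, and the two sample rows are made up so that the example is self-contained.

```python
import csv
import gzip
import os
import sqlite3
import tempfile


def load_tsv(conn, path, table):
    """Load a gzipped, tab-separated file into `table`, storing '\\N' as NULL."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as fh:
        # QUOTE_NONE: treat quote characters inside titles as plain text.
        reader = csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader)
        columns = ", ".join(f'"{name}" TEXT' for name in header)
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        marks = ", ".join("?" for _ in header)
        rows = ([None if value == "\\N" else value for value in row]
                for row in reader)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
    return header


with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny stand-in file; the ids and values are invented.
    path = os.path.join(tmp, "title.ratings.tsv.gz")
    with gzip.open(path, "wt", encoding="utf-8", newline="") as fh:
        fh.write("tconst\taverageRating\tnumVotes\n")
        fh.write("tt0000001\t5.7\t2000\n")
        fh.write("tt0000002\t\\N\t\\N\n")

    conn = sqlite3.connect(":memory:")
    header = load_tsv(conn, path, "title_ratings")

rows = conn.execute("SELECT * FROM title_ratings ORDER BY tconst").fetchall()
print(header)
print(rows)
```

Column types can be tightened afterwards, for example by casting numVotes to an integer once the data is known to be clean.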

Thankfully, this dataset came with some descriptive documentation. To get an even better idea of the data, you can quickly select the top 1000 rows from each table.

Let’s start looking for our first movie. Imagine you want to watch a horror movie. How can we isolate only the horror movies? Fortunately, this task is frighteningly simple.
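The article's original query is not included in this text. A minimal sketch of such a filter, assuming the genres column holds a comma-separated string as described in the IMDb documentation, is a LIKE match on title_basics. It is shown here through Python's sqlite3 module; the tconst values are placeholders, while the three titles and their genres come from the result tables in this article.

```python
import sqlite3

QUERY = """
SELECT primaryTitle, genres
FROM title_basics
WHERE genres LIKE ?
ORDER BY primaryTitle
"""


def titles_in_genre(conn, genre):
    """Return titles whose comma-separated genres string mentions `genre`."""
    return conn.execute(QUERY, (f"%{genre}%",)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE title_basics (tconst TEXT, titleType TEXT, "
    "primaryTitle TEXT, startYear INTEGER, genres TEXT)"
)
conn.executemany(
    "INSERT INTO title_basics VALUES (?, ?, ?, ?, ?)",
    [
        ("tt_a", "videoGame", "Resident Evil 4", 2005, "Action,Adventure,Horror"),
        ("tt_b", "movie", "Manichitrathazhu", 1993, "Comedy,Horror,Music"),
        ("tt_c", "movie", "Jaws", 1975, "Adventure,Thriller"),
    ],  # placeholder tconst values
)

print(titles_in_genre(conn, "Horror"))
```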

If this query causes any confusion, open this SQL cheat sheet to refresh your knowledge. Have this cheat sheet open for the rest of the tutorial to help you along!

What if we wanted to refine this horror movie list further? We could restrict the results to horror movies created after 1990, with an average rating above 9.0 and at least 10,000 votes.

This will involve getting data from multiple tables. Opening each table and taking a look at the column headers, we can see the following tables will be involved:

  • title_basics: handles the genre of the movie and the release year (represented by the column startYear).
  • title_ratings: handles the rating (averageRating) and votes (numVotes).

The two tables can be joined on the shared column, tconst. As explained in the IMDb documentation, tconst is an alphanumeric unique identifier of the title. Let’s write our query:
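The query text itself is missing from this copy of the article. The sketch below reconstructs the idea under assumptions: it joins title_basics and title_ratings on tconst and applies the filters as parameters, with an optional titleType filter. It runs on SQLite via Python; the two real rows are taken from the result tables in this article, the third row is a made-up title that the year filter should exclude, and all tconst values are placeholders.

```python
import sqlite3


def find_titles(conn, genre, after_year, rating_above, min_votes, title_type=None):
    """Join basics and ratings on tconst and apply the filters."""
    sql = """
        SELECT b.titleType, b.primaryTitle, b.startYear, b.genres,
               r.averageRating, r.numVotes
        FROM title_basics AS b
        JOIN title_ratings AS r ON r.tconst = b.tconst
        WHERE b.genres LIKE ?
          AND b.startYear > ?
          AND r.averageRating > ?
          AND r.numVotes >= ?
    """
    params = [f"%{genre}%", after_year, rating_above, min_votes]
    if title_type is not None:
        sql += " AND b.titleType = ?"
        params.append(title_type)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title_basics (tconst TEXT PRIMARY KEY, titleType TEXT,
    primaryTitle TEXT, startYear INTEGER, genres TEXT);
CREATE TABLE title_ratings (tconst TEXT PRIMARY KEY,
    averageRating REAL, numVotes INTEGER);
""")
conn.executemany("INSERT INTO title_basics VALUES (?, ?, ?, ?, ?)", [
    ("tt_a", "videoGame", "Resident Evil 4", 2005, "Action,Adventure,Horror"),
    ("tt_b", "movie", "Manichitrathazhu", 1993, "Comedy,Horror,Music"),
    ("tt_c", "movie", "Placeholder Old Horror", 1985, "Horror"),  # hypothetical
])
conn.executemany("INSERT INTO title_ratings VALUES (?, ?, ?)", [
    ("tt_a", 9.2, 11406), ("tt_b", 8.7, 9468), ("tt_c", 9.5, 20000)])

print(find_titles(conn, "Horror", 1990, 9.0, 10000))
print(find_titles(conn, "Horror", 1990, 8.0, 1000, title_type="movie"))
```

Leaving title_type unset returns every kind of title, which is how a video game can slip into a search meant for movies.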

titleType | primaryTitle | startYear | genres | averageRating | numVotes
videoGame | Resident Evil 4 | 2005 | Action,Adventure,Horror | 9.2 | 11406

Executing this query returns a single result, but not the result we want! On closer inspection, we can see that this title is a video game, not a movie. Let’s alter our query to include only movies, and expand the search by reducing the minimum number of votes required to 1,000 and the minimum rating required to 8.0.

titleType | primaryTitle | startYear | genres | averageRating | numVotes
movie | Manichitrathazhu | 1993 | Comedy,Horror,Music | 8.7 | 9468

Executing this query also yields a single result! Looks like we won’t have to decide what to watch anymore, since there’s only one option that fits our criteria!

Let’s run through another scenario. What if we want to see all of the movies Steven Spielberg has directed? How would this work?

By looking through the tables, we can determine the following:

  • name_basics: It contains the names of all actors, writers, directors, and others involved in the creation of film and TV titles.
  • title_crew: It acts as a linking table for titles, directors, and writers. We’ll use this table to connect Steven Spielberg to the titles he’s involved with.
  • title_basics: We have already used this table. It contains title information like name, release year, genres, etc.

Let’s get to work! Let’s write a query for the name_basics table to try and find the famous director Steven Spielberg.

Executing this query yields a single result:

nconst | primaryName | birthYear | deathYear | primaryProfession | knownForTitles
nm0000229 | Steven Spielberg | 1946 | NULL | producer,writer,director | tt0082971,tt0083866,tt0120815,tt0108052

This gives us the important value of nconst. From the documentation, we know that nconst is the alphanumeric unique identifier of the name/person.

We can feed this value into the title_crew table, which contains the director and writer information for all the titles in IMDb, and match Steven Spielberg to all the titles he’s involved with.

Executing this query results in a list of 45 titles. You can see from the value of the directors column that Steven Spielberg was the director of them all.

We need a way of using this list of titles alongside the title_basics table to get the name of the movies instead of just the tconst. Let’s use a subquery for this!
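The article's subquery is not preserved in this text, so the following is a sketch of how the three tables might be chained, not the author's query. It assumes title_crew.directors is a comma-separated list of nconst values and wraps both sides in commas so that one identifier cannot match as a prefix of another. It runs on SQLite via Python; the nconst for Steven Spielberg and the two film rows come from this article, while the tconst values, the second person, and "Placeholder Film" are hypothetical.

```python
import sqlite3

QUERY = """
SELECT b.titleType, b.primaryTitle, b.startYear, b.genres
FROM title_basics AS b
WHERE b.titleType = 'movie'
  AND b.tconst IN (
      SELECT c.tconst
      FROM title_crew AS c
      WHERE ',' || c.directors || ',' LIKE '%,' || (
                SELECT n.nconst
                FROM name_basics AS n
                WHERE n.primaryName = ?
            ) || ',%'
  )
ORDER BY b.startYear
"""


def movies_directed_by(conn, name):
    """Return movies whose directors list contains the person's nconst."""
    return conn.execute(QUERY, (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE name_basics (nconst TEXT PRIMARY KEY, primaryName TEXT);
CREATE TABLE title_crew (tconst TEXT PRIMARY KEY, directors TEXT);
CREATE TABLE title_basics (tconst TEXT PRIMARY KEY, titleType TEXT,
    primaryTitle TEXT, startYear INTEGER, genres TEXT);
""")
conn.executemany("INSERT INTO name_basics VALUES (?, ?)", [
    ("nm0000229", "Steven Spielberg"),
    ("nm00002290", "Placeholder Person"),  # hypothetical id sharing a prefix
])
conn.executemany("INSERT INTO title_basics VALUES (?, ?, ?, ?, ?)", [
    ("tt_a", "movie", "Jaws", 1975, "Adventure,Thriller"),
    ("tt_b", "movie", "Hook", 1991, "Adventure,Comedy,Family"),
    ("tt_c", "movie", "Placeholder Film", 2000, "Drama"),  # hypothetical
])
conn.executemany("INSERT INTO title_crew VALUES (?, ?)", [
    ("tt_a", "nm0000229"), ("tt_b", "nm0000229"), ("tt_c", "nm00002290")])

print(movies_directed_by(conn, "Steven Spielberg"))
```

The scalar subquery assumes the name is unique in name_basics; with several people of the same name, looking up the nconst first and passing it in directly would be safer.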

Execute this query to see the result:

titleType | primaryTitle | startYear | genres
movie | Firelight | 1964 | Sci-Fi,Thriller
movie | The Sugarland Express | 1974 | Crime,Drama
movie | Jaws | 1975 | Adventure,Thriller
movie | Close Encounters of the Third Kind | 1977 | Drama,Sci-Fi
movie | 1941 | 1979 | Action,Comedy,War
movie | Indiana Jones and the Raiders of the Lost Ark | 1981 | Action,Adventure
movie | E.T. the Extra-Terrestrial | 1982 | Family,Sci-Fi
movie | Indiana Jones and the Temple of Doom | 1984 | Action,Adventure
movie | The Color Purple | 1985 | Drama
movie | Empire of the Sun | 1987 | Action,Drama,History
movie | Always | 1989 | Drama,Fantasy,Romance
movie | Indiana Jones and the Last Crusade | 1989 | Action,Adventure
movie | Hook | 1991 | Adventure,Comedy,Family
movie | Jurassic Park | 1993 | Action,Adventure,Sci-Fi
movie | Schindler's List | 1993 | Biography,Drama,History
movie | Amistad | 1997 | Biography,Drama,History
movie | The Lost World: Jurassic Park | 1997 | Action,Adventure,Sci-Fi
movie | Saving Private Ryan | 1998 | Drama,War
movie | Minority Report | 2002 | Action,Crime,Mystery
movie | A.I. Artificial Intelligence | 2001 | Drama,Sci-Fi
movie | Catch Me If You Can | 2002 | Biography,Crime,Drama
movie | The Terminal | 2004 | Comedy,Drama,Romance
movie | Indiana Jones and the Kingdom of the Crystal Skull | 2008 | Action,Adventure
movie | War of the Worlds | 2005 | Adventure,Sci-Fi,Thriller
movie | Munich | 2005 | Action,Drama,History
movie | Lincoln | 2012 | Biography,Drama,History
movie | The Adventures of Tintin | 2011 | Action,Adventure,Animation

There we have it, all of the Steven Spielberg movie titles from our database!

Don’t stop here! Write your own custom queries to extract more insights from this large dataset. There are many ways to practice SQL. If you feel like you’ve had enough of working with this dataset, check out this post on 12 Ways to Learn SQL Online for more excellent learning resources.

You have learned how to import large existing datasets into the DBMS of your choice and how to use SQL to analyze a movie database. This is a powerful tool in your SQL arsenal. Not to mention, you’ll never have to worry about not being able to choose a movie to watch again! Completing SQL exercises on movie databases is a helpful way to learn, but if you would like more structure, check out this SQL Practice Set from LearnSQL.com.


Dataset: stanfordnlp/imdb

Dataset Card for "imdb"

Dataset Summary

Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

Supported Tasks and Leaderboards

More Information Needed

Dataset Structure

Data Instances

  • Size of downloaded dataset files: 84.13 MB
  • Size of the generated dataset: 133.23 MB
  • Total amount of disk used: 217.35 MB

An example of 'train' looks as follows.

Data Fields

The data fields are the same among all splits.

  • text : a string feature.
  • label : a classification label, with possible values including neg (0), pos (1).

Data Splits

name train unsupervised test
plain_text 25000 50000 25000

Dataset Creation

Curation Rationale

Source Data

Initial Data Collection and Normalization

Who are the source language producers?

Annotations

Annotation process

Who are the annotators?

Personal and Sensitive Information

Considerations for Using the Data

Social Impact of Dataset

Discussion of Biases

Other Known Limitations

Additional Information

Dataset Curators

Licensing Information

Citation Information

Contributions

Thanks to @ghazi-f , @patrickvonplaten , @lhoestq , @thomwolf for adding this dataset.


IMDb Non-Commercial Datasets

Subsets of IMDb data are available for access to customers for personal and non-commercial use. You can hold local copies of this data, and it is subject to our terms and conditions. Please refer to the Non-Commercial Licensing and copyright/license and verify compliance.

As of March 18, 2024 the datasets on this page are backed by a new data source. There has been no change in location or schema, but if you encounter issues with the datasets following the March 18th update, please contact [email protected].

Data Location

The dataset files can be accessed and downloaded from https://datasets.imdbws.com/ . The data is refreshed daily.

IMDb Dataset Details

Each dataset is contained in a gzipped, tab-separated-values (TSV) formatted file in the UTF-8 character set. The first line in each file contains headers that describe what is in each column. A ‘\N’ is used to denote that a particular field is missing or null for that title/name. The available datasets are as follows:

title.akas.tsv.gz

  • titleId (string) - a tconst, an alphanumeric unique identifier of the title
  • ordering (integer) – a number to uniquely identify rows for a given titleId
  • title (string) – the localized title
  • region (string) - the region for this version of the title
  • language (string) - the language of the title
  • types (array) - Enumerated set of attributes for this alternative title. One or more of the following: "alternative", "dvd", "festival", "tv", "video", "working", "original", "imdbDisplay". New values may be added in the future without warning
  • attributes (array) - Additional terms to describe this alternative title, not enumerated
  • isOriginalTitle (boolean) – 0: not original title; 1: original title

title.basics.tsv.gz

  • tconst (string) - alphanumeric unique identifier of the title
  • titleType (string) – the type/format of the title (e.g. movie, short, tvseries, tvepisode, video, etc)
  • primaryTitle (string) – the more popular title / the title used by the filmmakers on promotional materials at the point of release
  • originalTitle (string) - original title, in the original language
  • isAdult (boolean) - 0: non-adult title; 1: adult title
  • startYear (YYYY) – represents the release year of a title. In the case of TV Series, it is the series start year
  • endYear (YYYY) – TV Series end year. ‘\N’ for all other title types
  • runtimeMinutes – primary runtime of the title, in minutes
  • genres (string array) – includes up to three genres associated with the title

title.crew.tsv.gz

  • tconst (string) - alphanumeric unique identifier of the title
  • directors (array of nconsts) - director(s) of the given title
  • writers (array of nconsts) – writer(s) of the given title

title.episode.tsv.gz

  • tconst (string) - alphanumeric identifier of episode
  • parentTconst (string) - alphanumeric identifier of the parent TV Series
  • seasonNumber (integer) – season number the episode belongs to
  • episodeNumber (integer) – episode number of the tconst in the TV series

title.principals.tsv.gz

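  • tconst (string) - alphanumeric unique identifier of the title
  • ordering (integer) - a number to uniquely identify rows for a given titleId
  • nconst (string) - alphanumeric unique identifier of the name/person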
  • category (string) - the category of job that person was in
  • job (string) - the specific job title if applicable, else '\N'
  • characters (string) - the name of the character played if applicable, else '\N'

title.ratings.tsv.gz

  • tconst (string) - alphanumeric unique identifier of the title
  • averageRating - weighted average of all the individual user ratings
  • numVotes - number of votes the title has received

name.basics.tsv.gz

  • nconst (string) - alphanumeric unique identifier of the name/person
  • primaryName (string) - name by which the person is most often credited
  • birthYear – in YYYY format
  • deathYear – in YYYY format if applicable, else '\N'
  • primaryProfession (array of strings)– the top-3 professions of the person
  • knownForTitles (array of tconsts) – titles the person is known for


Assignment on IMDB database using sqlite3 and pandas

GopiSumanth/SQL


This repository contains the Db-IMDB database; its schema is in the db_schema file. The required SQL commands are in the mySql Commands file, which serves as my notes on SQL. The assignment questions are in the sql_questions file, and the solutions are in solutions.ipynb.

NOTE: If you find a better way to solve the assignment questions, kindly let me know. My email: [email protected]
