Data

Fantastic Data and Where to Find It

Airline On-Time Performance Data

The Bureau of Transportation Statistics publishes on-time arrival data for non-stop domestic flights by major air carriers. This application looks at all flights into or out of Illinois in 2017.

OpenFlights Airports Database

The OpenFlights Airports Database contains over 10,000 airports, train stations, and ferry terminals spanning the globe, as shown in the map above. We use each airport's IATA code to obtain geolocation and time zone data for every Illinois airport.
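The IATA lookup described above can be sketched in a few lines of Python. This is an illustration only, not the project's own code (which uses R and SQL). The column names are labels chosen here, because airports.dat ships without a header row; the function names and the sample row are likewise hypothetical. Missing values in the file are written as \N, so rows without an IATA code are skipped.

```python
import csv
import io

# Labels for the airports.dat columns, in file order (the file has no header).
COLUMNS = ["airport_id", "name", "city", "country", "iata", "icao",
           "latitude", "longitude", "altitude_ft", "utc_offset", "dst",
           "tz_name", "type", "source"]


def load_airports(text):
    """Index airports.dat rows by IATA code, skipping rows without one."""
    airports = {}
    for row in csv.reader(io.StringIO(text)):
        record = dict(zip(COLUMNS, row))
        iata = record.get("iata", "")
        if iata and iata != "\\N":  # "\N" marks a missing value in the file
            airports[iata] = record
    return airports


def locate(airports, iata):
    """Return (latitude, longitude, tz database name) for an IATA code."""
    rec = airports[iata]
    return float(rec["latitude"]), float(rec["longitude"]), rec["tz_name"]


# Illustrative row in the airports.dat layout, not read from the real file.
SAMPLE = ('3830,"Chicago O\'Hare International Airport","Chicago",'
          '"United States","ORD","KORD",41.9786,-87.9048,672,-6,"A",'
          '"America/Chicago","airport","OurAirports"\n')

airports = load_airports(SAMPLE)
print(locate(airports, "ORD"))
```

In practice the text would come from the downloaded airports.dat file rather than an inline string.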

GHCN Daily Data

The National Oceanic and Atmospheric Administration (NOAA) provides access to the Global Historical Climatology Network (GHCN), which contains daily, monthly, and yearly temperature, precipitation, and snow records over global land areas. We used the GHCN-Daily data for 2017 to determine the closest weather station to each airport in the US.
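One way to find the closest station to an airport is to compare great-circle distances. The sketch below uses the haversine formula and a simple linear scan. It is an assumption about how such a match could be done, not a description of the repository's SQL or R scripts, which may use a different method. The function names, station IDs, and coordinates are placeholders.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_station(airport, stations):
    """airport is (lat, lon); stations is a list of (station_id, lat, lon)."""
    lat, lon = airport
    return min(stations, key=lambda s: haversine_km(lat, lon, s[1], s[2]))


# Placeholder data: one airport and two hypothetical stations.
airport = (41.98, -87.90)
stations = [("STATION_A", 41.99, -87.93), ("STATION_B", 39.84, -89.68)]
print(nearest_station(airport, stations)[0])
```

A linear scan is adequate for a few hundred airports; for much larger inputs a spatial index would be the usual next step.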


Data Mangling Stuff (Skip to Deployment):

You need R, preferably RStudio, and SQL Server 2016 (we used Microsoft SQL Server 2016 x64) installed to do the data mangling required for this project:

  1. Create a database ‘cs424’ in SQL Server.
  2. Download the OTP Data and create a table ‘On_Time_Performance_2017’.
  3. Download ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt and create a table ‘Stations’.
  4. Download the 2017 archive from ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/by_year/ and create a table ‘weather’.
  5. Download https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat and create a table ‘Airport’.
  6. Download ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt and create a table ‘Station’.
  7. Run the queries from the SQL file https://github.com/vrevan2/LearningToFly/blob/master/Application/data_mangling.sql and store the query results as .csv files.
To create the DeepDiveMap data, run map.R to generate the cancelled-flights data by date and the data for all flights. Then run compiledWeatherData.R to produce the final data file for the map.
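A note on loading ghcnd-stations.txt: it is a fixed-width file, not a CSV, so fields must be cut out by character position before they can go into a table. The Python sketch below shows one way to do that, using the column positions given in the GHCN-Daily readme (ID in columns 1-11, latitude 13-20, longitude 22-30, elevation 32-37, state 39-40, name 42-71). The function name and the sample line are illustrative assumptions, and the project itself may have imported the file differently.

```python
def parse_station_line(line):
    """Split one fixed-width ghcnd-stations.txt line into named fields."""
    return {
        "id": line[0:11].strip(),
        "latitude": float(line[12:20]),
        "longitude": float(line[21:30]),
        "elevation_m": float(line[31:37]),
        "state": line[38:40].strip(),
        "name": line[41:71].strip(),
    }


# Illustrative line built to the documented column widths,
# not copied from the real file.
SAMPLE_LINE = "{:<11} {:>8} {:>9} {:>6} {:<2} {:<30}".format(
    "USW00094846", "41.9950", "-87.9336", "201.8", "IL",
    "CHICAGO OHARE INTL AP")

station = parse_station_line(SAMPLE_LINE)
print(station["id"], station["latitude"], station["longitude"])
```

The parsed rows can then be written out as CSV or inserted into the stations table directly.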

Instructions for Deployment:

  1. Install RStudio and R: https://www.rstudio.com/products/rstudio/download/
  2. Clone the repository: https://github.com/vrevan2/LearningToFly
  3. Install the libraries listed in the library section of App.R.
  4. Check the GDAL version that rgdal uses by running the command gdalinfo --version. For versions lower than 2.2, change line 856 from layer = 'us_states_hexgrid' to layer = 'OGRGeoJSON'.
  5. Finally, run the app.