I have noticed that my schedule table, which is built by systematically downloading schedule information for the top 100 airlines (from CountAllEnrouteOperations), contains every Aeroflot flight twice, under two different ICAO codes: AFL (the real one) and SMM (a false one).
This seems to be a bug in the reference data of FlightAware.
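A minimal sketch of how such duplicates might be spotted in a schedule table with pandas. It assumes each row has an `ident` column whose first three characters are the ICAO airline code, and that the duplicated rows share flight number, origin, destination and departure time; apart from `destination`, these column names and the sample rows are illustrative assumptions, not real schedule data.

```python
import pandas as pd


def find_cross_airline_duplicates(schedule, key_cols=("origin", "destination", "departuretime")):
    """Return rows whose route, time and flight number also appear under another airline code."""
    df = schedule.copy()
    # Assumption: ident = 3-letter ICAO airline code followed by the flight number.
    df["airline"] = df["ident"].str[:3]
    df["flight_number"] = df["ident"].str[3:]
    keys = list(key_cols) + ["flight_number"]
    # Number of distinct airline codes within each (route, time, flight number) group.
    n_codes = df.groupby(keys)["airline"].transform("nunique")
    return df[n_codes > 1].sort_values(keys + ["airline"]).reset_index(drop=True)


# Made-up sample rows, for illustration only.
sample = pd.DataFrame([
    {"ident": "AFL0001", "origin": "UUEE", "destination": "ULLI", "departuretime": 1510000000},
    {"ident": "SMM0001", "origin": "UUEE", "destination": "ULLI", "departuretime": 1510000000},
    {"ident": "AFL0002", "origin": "UUEE", "destination": "EDDF", "departuretime": 1510003600},
])
dupes = find_cross_airline_duplicates(sample)
print(dupes[["ident", "origin", "destination"]])
```

Only the first two sample rows are reported, because the third has no counterpart under a second airline code.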
offtopic Simferopol de-facto is RU
cbw
November 6, 2017, 11:26pm
3
I'm not sure where the correlation is coming from. Could you provide any other info on the SMM flights?
FlightAware contains a lot of mistakes regarding flights of Russian airlines…
For example, in app reviews https://play.google.com/store/apps/details?id=com.flightaware.android.liveFlightTracker you can notice very disappointed Russian users:
“… wrong an airlines, wrong a flights …”
Use this code with ICAO code 'AFL' and see what you get:
# ## AirlineFlightSchedules
import time

import requests
import pandas as pd

import wait  # local helper module (not shown in this post); presumably throttles requests

# NOTE: fxmlUrl and apiKey are not defined in this post; set them before calling.

# Query window: the next 24 hours, as Unix epoch seconds
t1 = int(time.time())
t2 = t1 + 3600 * 24


# --- GET SCHEDULE ---
def get_Schedule_by_Airline(airline):
    # -- Details for the API
    username = "funkmeister380"
    kr = 'AirlineFlightSchedulesResult'
    list_of_flights = []
    payload = {'start_date': str(t1), 'end_date': str(t2), 'airline': airline,
               'exclude_codeshare': 'True', 'howMany': '150', 'offset': '0'}
    next_offset = 0
    # -- Fetch pages until the API reports next_offset == -1
    while next_offset > -1:
        payload['offset'] = str(next_offset)
        response = requests.get(fxmlUrl + "AirlineFlightSchedules",
                                params=payload, auth=(username, apiKey))
        if response.status_code != 200:
            print("Stopped with offset " + str(next_offset) + " " + response.text)
            break
        r2 = response.json()
        if kr not in r2:
            print("Error executing request " + response.text)
            break
        # extend (not append) so the result stays a flat list of flight records
        list_of_flights.extend(r2[kr]['flights'])
        next_offset = r2[kr]['next_offset']
        # print(next_offset)
        if next_offset > -1:
            wait.wait()
    if len(list_of_flights) > 0:
        flights_pd = pd.DataFrame(list_of_flights)  # one row per scheduled flight
        print("-------------")
        print(airline)
        print(flights_pd.groupby("destination").size().sort_values(ascending=False).head(10))
        return flights_pd.drop_duplicates()
    else:
        print("No schedule for Airline " + airline)
        return None
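The paging logic of the function above can be tried without API credentials. The sketch below replaces the HTTP call with a stub: `collect_all_pages`, `fetch_page` and `FAKE_PAGES` are hypothetical names introduced for illustration, and the records are placeholders rather than real flights. It only shows how following `next_offset` until it reaches -1 gathers every page into one flat list.

```python
def collect_all_pages(fetch_page, first_offset=0):
    """Gather records from every page until next_offset is -1.

    fetch_page is a hypothetical callable: offset -> (records, next_offset).
    """
    records = []
    offset = first_offset
    while offset > -1:
        page, offset = fetch_page(offset)
        # extend keeps one flat list instead of a list of pages
        records.extend(page)
    return records


# Stub standing in for the real request: three pages of placeholder records.
FAKE_PAGES = {
    0: (["f1", "f2"], 2),
    2: (["f3", "f4"], 4),
    4: (["f5"], -1),
}

all_flights = collect_all_pages(lambda offset: FAKE_PAGES[offset])
print(all_flights)
```

Passing the fetcher in as an argument keeps the loop separate from the network code, so the termination condition can be checked on its own.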
cbw
November 7, 2017, 8:55pm
7
Took some digging, but I believe we found the issue and corrected it. Thank you for the heads up on it.