What happened to Ryanair's flights?


#1

I have downloaded CountAllEnrouteAirlineOperations every hour and plotted the results as shown. Note how, for a few days at the end of October, the number of Ryanair flights in the air drops sharply. At first I thought there might be a data problem.

So I also analysed data obtained in a very different way: by downloading FlightInfoStatus for every RYR flight number I know.

Again, I saw a similar pattern, where the number of daily flights drops dramatically during the same few days. Since the data was obtained “differently”, one could argue “it must be true”. However, it could still be a data problem, whereby FlightAware simply did not get any data from Ryanair for a certain portion of flights.

What do you think?

+----------+-----+
|      Date|count|
+----------+-----+
|2017-10-23| 1164|
|2017-10-24| 1148|
|2017-10-25| 1996|
|2017-10-26| 1978|
|2017-10-27| 2078|
|2017-10-28| 1954|
|2017-10-29|  602|
|2017-10-30|  571|
|2017-10-31|  379|
|2017-11-01|  378|
|2017-11-02|  744|
|2017-11-03| 1814|
|2017-11-04| 1658|
+----------+-----+

#2

That is in line with the change in IATA season.


#3

Oh - can you elaborate on that? It looks like the drop was much smaller for the other airlines…
Also, according to my data, the number of scheduled flights does not change that much:

+----------+-----+
|      date|count|
+----------+-----+
|2017-10-24| 1913|
|2017-10-25| 1884|
|2017-10-26| 1534|
|2017-10-27| 2195|
|2017-10-28| 2086|
|2017-10-29| 2050|
|2017-10-30| 2079|
|2017-10-31| 1878|
|2017-11-01| 1888|
|2017-11-02| 1898|
+----------+-----+
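
A quick way to compare the two tables is to divide the daily count from post #1 by the scheduled count above. This is only a rough sketch: the variable names are made up, the numbers are copied from the two tables, and it assumes both tables count the same kind of flight, which may not hold.

```python
# Daily counts copied from the tables in this thread (only dates present in both).
observed = {
    "2017-10-24": 1148, "2017-10-25": 1996, "2017-10-26": 1978,
    "2017-10-27": 2078, "2017-10-28": 1954, "2017-10-29": 602,
    "2017-10-30": 571, "2017-10-31": 379, "2017-11-01": 378,
    "2017-11-02": 744,
}
scheduled = {
    "2017-10-24": 1913, "2017-10-25": 1884, "2017-10-26": 1534,
    "2017-10-27": 2195, "2017-10-28": 2086, "2017-10-29": 2050,
    "2017-10-30": 2079, "2017-10-31": 1878, "2017-11-01": 1888,
    "2017-11-02": 1898,
}

# Ratio of observed to scheduled flights for each day covered by both tables.
ratio = {day: observed[day] / scheduled[day] for day in scheduled if day in observed}

for day in sorted(ratio):
    print(f"{day}  observed={observed[day]:5d}  scheduled={scheduled[day]:5d}  ratio={ratio[day]:.2f}")
```

A ratio far below the surrounding days marks the dates where the two sources disagree most.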

#4

Ryanair appears to have had a lot of cancellations on those days:

 filed_departuretime | cancellations
---------------------+---------------
 2017-10-20          |            49
 2017-10-21          |            51
 2017-10-22          |            50
 2017-10-23          |            49
 2017-10-24          |            53
 2017-10-25          |            53
 2017-10-26          |            55
 2017-10-27          |            47
 2017-10-28          |            48
 2017-10-29          |          1419
 2017-10-30          |          1397
 2017-10-31          |          1306
 2017-11-01          |          1318
 2017-11-02          |          1241
 2017-11-03          |            19
 2017-11-04          |             8
 2017-11-05          |            20
 2017-11-06          |             5
 2017-11-07          |            13
 2017-11-08          |             4
 2017-11-09          |             7
 2017-11-10          |            10
 2017-11-11          |             4
 2017-11-12          |             5
 2017-11-13          |            14
 2017-11-14          |             1

#5

That was probably associated with their pilot staffing issues:


#6

@bovineone … Interesting… I’d be interested to know what your query looked like. Can you post the SQL or other code?

For some reason, my dataset does not give me the same results. I would like to get the same (or similar) results, and perhaps I am filtering the wrong way.

My code:

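from pyspark.sql.functions import date_format, from_unixtime

# List Ryanair (RYR) flights whose status is 'Cancelled', one row per flight.
# df is the DataFrame built from my downloaded FlightInfoStatus results.
(
    df.filter(df.status == 'Cancelled')
    .filter(df.airline == 'RYR')
    .select(
        date_format(df.dep_loc_date, 'yyyy-MM-dd').alias('Date'),
        df.origin_iata,
        df.destination_iata,
        df.ident,
        df.distance_filed,
        # The filed times are stored as epoch seconds; convert them to timestamps.
        from_unixtime(df.filed_departure_time_epoch).alias('Departure'),
        from_unixtime(df.filed_arrival_time_epoch).alias('Arrival'),
    )
    # Sort on the selected 'Date' alias rather than the original column.
    .orderBy('Date')
).show(100)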

With these results (far fewer than yours):

+----------+-----------+----------------+-------+--------------+-------------------+-------------------+
|      Date|origin_iata|destination_iata|  ident|distance_filed|          Departure|            Arrival|
+----------+-----------+----------------+-------+--------------+-------------------+-------------------+
|2017-10-24|       null|             PSA|RYR9779|           440|2017-10-24 16:00:00|2017-10-24 17:15:00|
|2017-10-24|        TUF|             MRS|RYR7709|           352|2017-10-24 10:35:00|2017-10-24 11:35:00|
|2017-10-24|        BGY|            null|RYR8095|           541|2017-10-24 21:45:00|2017-10-24 23:10:00|
|2017-10-24|       null|             EIN|RYR8832|           957|2017-10-24 19:55:00|2017-10-24 22:15:00|
|2017-10-27|        PSA|             TPS|RYR9986|           414|2017-10-27 21:20:00|2017-10-27 22:25:00|
|2017-10-27|        TPS|             PSA|RYR9987|           414|2017-10-27 19:30:00|2017-10-27 20:25:00|
|2017-10-29|       null|             STN|RYR3633|           369|2017-10-29 21:40:00|2017-10-29 22:50:00|
|2017-11-03|        KRK|             PSA|RYR7932|           626|2017-11-03 21:10:00|2017-11-03 22:50:00|
|2017-11-03|        FUE|             PSA|RYR9423|          1706|2017-11-03 19:30:00|2017-11-03 23:12:00|
|2017-11-03|        PSA|             PMO|RYR9994|           406|2017-11-03 20:40:00|2017-11-03 21:20:00|
|2017-11-03|        PSA|             CAG|RYR9933|           314|2017-11-03 21:10:00|2017-11-03 21:45:00|
|2017-11-03|        CAG|             PSA|RYR9934|           314|2017-11-03 19:30:00|2017-11-03 20:25:00|
+----------+-----------+----------------+-------+--------------+-------------------+-------------------+

#7

Also, Ryanair has always claimed that the pilot staffing issues only affected a small percentage of their flights. Your data suggests that more than half of their scheduled flights were cancelled between 2017-10-29 and 2017-11-02. We would have heard about that separately in the news!
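
As a rough check of the "more than half" figure, the sketch below divides the cancellation counts from post #4 by the scheduled counts from post #3 for those five days. The two tables come from different sources, so this is only indicative, and the variable names are made up for illustration.

```python
# Cancellations per day, copied from the table in post #4.
cancelled = {
    "2017-10-29": 1419, "2017-10-30": 1397, "2017-10-31": 1306,
    "2017-11-01": 1318, "2017-11-02": 1241,
}
# Scheduled flights per day, copied from the table in post #3.
scheduled = {
    "2017-10-29": 2050, "2017-10-30": 2079, "2017-10-31": 1878,
    "2017-11-01": 1888, "2017-11-02": 1898,
}

# Share of scheduled flights reported as cancelled, per day and overall.
share = {day: cancelled[day] / scheduled[day] for day in cancelled}
total_share = sum(cancelled.values()) / sum(scheduled.values())

for day in sorted(share):
    print(f"{day}  {share[day]:.0%}")
print(f"overall  {total_share:.0%}")
```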


#8

My query was against the internal SQL database that is used by FlightXML:

select filed_departuretime::date, count(*) as cancellations
from boards_all
where ident like 'RYR%'
  and filed_departuretime > '2017-10-20'
  and cancellation is not null
group by 1
order by 1;
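
For anyone comparing this with the PySpark filter in post #6, here is a minimal plain-Python sketch of the same logic. Note that the query matches on an ident prefix and a non-null cancellation field, while post #6 filters on airline and status; whether those fields differ in the downloaded dataset is not known from this thread. The rows, their values, and the function name are hypothetical, and the type of the cancellation column is an assumption.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows: field names follow the SQL above, all values are made up.
rows = [
    {"ident": "RYR100", "filed_departuretime": datetime(2017, 10, 29, 6, 30), "cancellation": "2017-10-29 01:00"},
    {"ident": "RYR200", "filed_departuretime": datetime(2017, 10, 29, 9, 0), "cancellation": None},
    {"ident": "RYR300", "filed_departuretime": datetime(2017, 10, 30, 7, 15), "cancellation": "2017-10-30 02:00"},
    {"ident": "EZY400", "filed_departuretime": datetime(2017, 10, 30, 8, 0), "cancellation": "2017-10-30 02:00"},
    {"ident": "RYR500", "filed_departuretime": datetime(2017, 10, 19, 8, 0), "cancellation": "2017-10-19 02:00"},
]


def cancellations_per_day(rows, prefix="RYR", after=datetime(2017, 10, 20)):
    """Count rows per departure date, mirroring the WHERE and GROUP BY clauses."""
    counts = Counter(
        r["filed_departuretime"].date()
        for r in rows
        if r["ident"].startswith(prefix)        # ident like 'RYR%'
        and r["filed_departuretime"] > after    # filed_departuretime > '2017-10-20'
        and r["cancellation"] is not None       # cancellation is not null
    )
    return sorted(counts.items())               # order by 1


for day, n in cancellations_per_day(rows):
    print(day.isoformat(), n)
```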


#9

This data is from: 2017-11-05 21:09


#10

We think the drop in flights (and increase in perceived cancellations) may have been an artifact of flight callsigns changing during that time and the new data not being available until Nov 2-3.


#11

^^^ is a symptom of the change in IATA seasons.


#12

Hi BovineOne, OK, what I hear you saying is that the data is faulty. Are you saying that this is likely to happen whenever the IATA season changes?

I am asking because I want to use the data to train machine learning models, and faulty data will lead to faulty models. If there is a way to get a database dump of the clean data from you, I’d be happy to hear about it.


#13

We do not have any better data. Whenever there is a change to how European operators allocate ATC identifiers to their flights there will be a mismatch for a few days while our system is re-trained.
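
One possible workaround for model training is to leave out the days around an identifier change. The sketch below is an illustration only, not a FlightAware feature: the window is taken from the cancellation spike in post #4, and the record layout and function names are hypothetical.

```python
from datetime import date

# Suspect window taken from the cancellation spike in this thread (assumption);
# add one (start, end) pair per identifier change that affects the data.
SUSPECT_WINDOWS = [(date(2017, 10, 29), date(2017, 11, 2))]


def is_suspect(day, windows=SUSPECT_WINDOWS):
    """Return True if the day falls inside any suspect window (inclusive)."""
    return any(start <= day <= end for start, end in windows)


def drop_suspect(records, windows=SUSPECT_WINDOWS):
    """Keep only records whose 'date' lies outside every suspect window."""
    return [r for r in records if not is_suspect(r["date"], windows)]


# Sample records using daily counts from the table in post #1.
sample = [
    {"date": date(2017, 10, 28), "count": 1954},
    {"date": date(2017, 10, 29), "count": 602},
    {"date": date(2017, 11, 2), "count": 744},
    {"date": date(2017, 11, 3), "count": 1814},
]
kept = drop_suspect(sample)
print([r["date"].isoformat() for r in kept])
```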