The 2014 Marine Corps Marathon took place on October 27th on a course that runs through Washington DC and Arlington, VA. This year my son ran in the race, so I was there for the excitement. But being an observer introduced me to some of the data quality issues. During the race I was set up to get a text message as he passed the 10k, 20k, 30k, and 40k points in the race. The system failed miserably. I got the texts, but some were so late as to be meaningless. After the race the more significant data quality issue became apparent: they had the wrong start time for him.
Now step back. The Marine Corps Marathon uses an RFID chip on each runner to track where they are in the race. The chip is also used to capture each runner’s start and finish times. With over 20,000 finishers in the race it is impossible for all of them to cross the start line at the same time. Also, with 20,000 runners there are certain to be problems with the system. When one views the results page for the race there are two time markers for each runner: the net time and the clock time. The clock time is the time on the race clock when the runner crossed the finish line. The net time is the time that it took the runner from the moment he or she crossed the start line until the moment he or she crossed the finish line. So, for example, the first runner listed on the results page crossed the finish line in 3:53, but took 3:50 to run the race. For my son the net time was in error. He crossed the start line about three minutes into the race, but the initial results showed that he crossed at the gun. That is what got me looking at the data.
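The relationship between the two times can be written down in a few lines. This is only a sketch: the function names are my own, and I am assuming the times are formatted as H:MM:SS, which may not match the results page exactly. The difference between clock time and net time is how long after the gun the runner crossed the start line.

```python
def to_seconds(hms: str) -> int:
    """Convert a time such as 'H:MM:SS' or 'MM:SS' to whole seconds."""
    seconds = 0
    for part in hms.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def start_offset(clock_time: str, net_time: str) -> int:
    """Seconds after the gun at which the runner crossed the start line."""
    return to_seconds(clock_time) - to_seconds(net_time)


# The example above, with the seconds assumed to be zero for illustration.
offset = start_offset("3:53:00", "3:50:00")
print(offset // 60, "minutes after the gun")
```

A runner recorded as crossing at the gun would show an offset of zero, which is what my son's initial result looked like.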
An attempt is made to put the faster runners at the front of the queue so that the faster runners do not have to pass the slower runners during the race. One of my hopes is to do some analysis on how well the predictions worked. But that is for a future post.
The results for the race were available on race day, but were labeled as unofficial. I downloaded those results and did a bit of analysis. The official results were posted about a week ago, so I downloaded those results and looked again at some of the issues. As it turns out, not all of the problems with the start time were resolved. It actually looks like very few were corrected.
The official results listed 23,468 finishers. It is not apparent until you look at the finish times that the list includes 88 runners who are in what are called the “rim” and “wheel” categories. Their results are also hidden on a separate page. I decided for my purposes to delete them from the list. One can argue if that is appropriate or not. This left 23,380 runners.
While downloading the data I discovered that at least one runner was in the file twice. A quick check of the file found two other cases. There were 23,377 runners remaining.
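The two cleanup steps (dropping the rim and wheel entries, then removing repeated runners) might look something like the sketch below. The field names `bib` and `category`, the category labels, and the toy rows are all hypothetical, and matching duplicates on bib number is my assumption about how a repeated runner would be spotted.

```python
# Hypothetical category labels; the real file may spell them differently.
EXCLUDED_CATEGORIES = {"rim", "wheel"}


def clean_results(rows):
    """Drop excluded categories, then keep the first row seen for each bib."""
    kept = []
    seen_bibs = set()
    for row in rows:
        if row["category"] in EXCLUDED_CATEGORIES:
            continue
        if row["bib"] in seen_bibs:
            continue  # repeated runner: keep only the first occurrence
        seen_bibs.add(row["bib"])
        kept.append(row)
    return kept


# Toy rows, invented for illustration only.
sample = [
    {"bib": 101, "category": "run"},
    {"bib": 102, "category": "wheel"},
    {"bib": 103, "category": "run"},
    {"bib": 101, "category": "run"},  # duplicate of the first row
    {"bib": 104, "category": "rim"},
]
cleaned = clean_results(sample)

# The counts reported above follow the same two subtractions.
remaining = 23_468 - 88 - 3
print(len(cleaned), remaining)
```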
Next up was an examination of the distribution of net time vs. clock time for the runners. The first problem was that one runner, at least according to the data, started the race five minutes before the gun. Perhaps they were standing on the wrong side of the start line when the race started. You can watch the start of the race and the finish line videos here. The count of how many runners crossed the start line in each second of the race was informative. It looks very much like the issue of getting the correct start time was not addressed for most runners. In the first second of the race 303 runners crossed the start line. In the next second this dropped to 54 runners. After that it quickly leveled off to 30-40 runners per second. Given the experience with my son, I very much doubt that the 303 number is correct. Looking at the run times of those who, according to the database, crossed the line in the first second, I also very much doubt that the count of 303 runners is anywhere near correct.
To put this in perspective, the error rate is actually very low. Most likely, given the size of the crowd, the RFID chip did not register when some of the runners crossed the start line. The system must have defaulted to placing them at the start line when the race started. This is a failure rate of around one percent. Runners who brought up the issue had their times corrected. My son’s was changed. I do not have any idea how they made the correction. In his case they did ask him to provide any indication of where he was in the crowd. From the photos his friends took he was able to give them some bib numbers of other runners near him.
Next up, assuming I get to it, is some analysis of the run times of those who ran the race and how well the placement of the runners in the queue worked.
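For anyone who wants to repeat the count, here is a rough sketch of the tally. It assumes each runner's start offset (clock time minus net time, in seconds) has already been computed. The offsets in `sample_offsets` are made up for illustration and are not taken from the race file.

```python
from collections import Counter


def starts_per_second(offsets):
    """Count how many runners share each start offset (seconds after the gun)."""
    return Counter(offsets)


def started_before_gun(offsets):
    """Offsets below zero mean the data says the runner started before the gun."""
    return [s for s in offsets if s < 0]


# Invented offsets: a pile-up at zero, one normal start, one impossible one.
sample_offsets = [0, 0, 0, 0, 1, 2, 2, 180, -300]

counts = starts_per_second(sample_offsets)
for second in sorted(counts):
    print(second, counts[second])

print("before the gun:", started_before_gun(sample_offsets))
```

A count at second zero that is far larger than the seconds that follow is the pattern that made me suspicious.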
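The "around one percent" figure comes from simple arithmetic on the counts already given. The sketch below brackets it two ways. Using the 54 runners seen in the next second as a stand-in for a normal first second is my own assumption, not something the race data confirms.

```python
RUNNERS = 23_377        # runners left after the cleanup described earlier
FIRST_SECOND = 303      # runners recorded as crossing in the first second
NEXT_SECOND = 54        # runners recorded in the following second


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100 * part / whole


# Upper bound: treat every first-second runner as a missed chip read.
upper = percent(FIRST_SECOND, RUNNERS)

# Lower estimate: assume a genuine first second looks like the next one.
lower = percent(FIRST_SECOND - NEXT_SECOND, RUNNERS)

print(f"between {lower:.1f}% and {upper:.1f}% of runners")
```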
Robert Samuelson writes an opinion piece in the Washington Post each week. The latest was titled “The fat cats vs. the facts.” That was the print edition title. The digital edition was titled “Government is not beholden to the rich.” In it he seems to launch an attack on the elderly and on the poor, making a case that government taxes the rich and gives to the elderly and to the poor. He uses data from the recent Congressional Budget Office report: “The Distribution of Federal Spending and Taxes in 2006.” That report looks at data from 2006, the most recent available, on taxes and benefits split by elderly and non-elderly, and then for the non-elderly looks at the same numbers by quintiles of the income distribution.
Not surprisingly, the report finds that money flows toward the elderly and the poor. Samuelson quotes numbers from the report showing that the elderly, those over 65, receive a net of about $13,000 from the government, and the poorest fifth of households receive about $12,600, while the richest fifth pay about $66,000.
But should we really be looking at the elderly vs. the non-elderly? Certainly it shows the current state of affairs. However, a historical perspective is needed in this discussion. Over the past 100 years or so, and certainly since Roosevelt was president back in the 1930s, the US and much of the world have moved from a culture where families took care of the elderly, both in terms of health care and in terms of income, to a culture where the government provides a sizable portion of that help. We have done that by taxing people when they are younger and paying them when they are older via Social Security, Medicare, and other programs.
It is this cultural shift that makes it inappropriate to compare and argue about the differences in government taxes and payments for a 75 year old vs. those for a 50 year old. This is not a cross-sectional issue. It is rather a longitudinal issue. The question is: did the current elderly beneficiary pay sufficient tax in his younger years to justify the benefits received today? The Social Security system and the Medicare system are in many ways an insurance policy. Everyone pays into the system. Those who survive long enough receive the benefits. So the real questions need to be: Are the taxes high enough during the working years to justify the benefits received? Is the retirement age set at a level that provides solvency to the system given the level of taxes and benefits? And are the benefits set at a level commensurate with the tax rate, the retirement age, and the life expectancy of the survivors?
These questions do not pit the elderly against the non-elderly. Rather they recognize the life cycle of the individual within the system.
Some key observations need to be made. The first is that life expectancy has been increasing over time. Thus no one should expect benefits to remain constant and be paid for a longer period of time while the tax rates stay the same. Something has to change. It can be the tax rate, the retirement age, or the level of benefits. Choose your poison. With the continual increase in life spans, a 40 year old today can expect to collect more from the system than a current 60 year old. Adjustments for that change need to be considered. Along the same lines, and relevant to Samuelson’s discussion of tax rate differences by income level, it is well documented that the wealthy have a longer life expectancy than do the poor. So while the wealthy pay more in taxes, they can also expect to collect benefits longer.
And finally a couple of specific comments on Samuelson’s writing are in order. I really wish he had done a better job of reading what he wrote.
He notes early in the piece:
The CBO divides the population into elderly (65 and over) and non-elderly households. They’re respectively 15 percent and 85 percent of the population.
But later he writes as if there is a problem:
The non-elderly paid almost 85 percent of taxes,…
Where is the problem? The elderly are 15 percent of the population and they pay 15 percent of the taxes?
By Samuelson’s logic that the elderly get undue benefits, I wonder what he thinks of our children as they pay essentially no taxes yet reap the lifetime benefits of the education they receive at the expense of the government.
But the one that really has me puzzled is when he writes:
Democracy’s problem is not the influence of money. It’s the influence of people.
I thought we called our democracy a government of the people, by the people, and for the people. I must assume that Samuelson does not believe in that kind of government. I wonder what kind he does believe in.
Back during the height of the recent recession we had the cash for clunkers program, where our government spent close to $3 billion financing the trade-in of so-called “clunkers” for new cars in the hopes of providing a boost to the nation’s economy. The twofold strategy of the program was to give a boost to the auto industry, where sales were lagging, and at the same time to help reduce greenhouse emissions.
Earlier this month Brookings released a report, “Cash for Clunkers: An Evaluation of the Car Allowance Rebate System.” Much of the discussion in the report is based on a paper from Resources for the Future titled “Evaluating ‘Cash for Clunkers’: Program Effects on Auto Sales, Jobs, and the Environment.” Both papers make for very dry reading. They are very much fact-oriented. The shortcoming of the papers, if any, is that it is not always clear exactly what data come from where. The worst data source was a survey of program participants that had only a 21 percent participation rate. That fact is only cited in the Brookings paper. I doubt that the data from the survey had a major impact on the findings cited in the two papers.
The main thrust of the papers was that the program did little to reduce greenhouse gas emissions and that the cost of the program for each new job created was quite high. The impact was simply to move vehicle purchases up in time. In short, people bought cars in June that they otherwise, without the program, would have purchased in August.
I would raise the question of what we expected from the program. It was set up to encourage people to trade in old cars and purchase new cars with somewhat better gas mileage. It seems quite obvious that the goal was simply to move purchases up in time. Any criticism of the program on that dimension needs to ask whether that movement was effective. What is effective? I don’t know. Is moving purchases up six months good? Is moving them up three months good? Is moving them up a full year good? The real effect of any program of this nature is to provide a boost in jobs now while fully recognizing that we are paying the price of fewer jobs in the future. A car manufactured today for purchase today is a car not manufactured six months from now for a purchase then. Total vehicles manufactured will go up as now I am, for example, using the vehicle for nine years that I would otherwise have used for ten years. So the metric that should be used must answer the question of whether the shift in vehicle purchase patterns was worth the money spent. The shift was going to happen. One should not criticize the program because there was a shift, but rather because the shift was ineffective in accomplishing the stated goals. The auto industry would argue that the program helped them when they were in dire straits due to a lack of sales. And I might add, is this really any different from providing help to farmers in times of drought? Ask instead whether the government should be providing that help to either or both groups.
However, it seemed many wanted to use the Brookings report to bash the cash for clunkers program without asking these important questions. George Will called it “The lunacy of ‘Cash for Clunkers.’” Fox News said “Cash for Clunkers program falls short of goals to help economy, environment.” Autoblog said “Brookings Institution says Cash For Clunkers was a bust.” My favorite statement in what they had to report was
CARS did achieve its other goal of reducing the number of fuel-inefficient cars on the road, but that impact also was negligible, as the new, green, fuel-efficient cars bought within the program reportedly represented about half of a percent of all vehicles in the US.
They really only said we got what was expected. Given that the program targeted the replacement of 700,000 cars, that was only half a percent of the inventory. So the impact on overall fuel-efficiency was “negligible.” Did anyone who was thinking when the program was proposed and approved by the Congress do the math? How did they ever expect a different outcome?
Hotair.com went further, saying “Cash for Clunkers a near-total failure says … the Brookings Institute.” I don’t think Brookings put it that way. The data may say it is true, but all they gave was the data. They seem to have left the dramatic statements to others.
In the interest of disclosing any conflict of interest, I don’t work for the auto industry and never have. I did not take advantage of the program and I did not like it when it was proposed. My reason was that there were too many loopholes in the system. With two cars, one an SUV and the second a fuel-efficient subcompact, it would be easy to trade both cars in, the SUV for a subcompact and the subcompact for an SUV, and reap the benefits of the program.
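Doing the math is not hard. Here is a back-of-the-envelope sketch of how much fleet fuel use could change when half a percent of vehicles are swapped for more efficient ones. The miles-per-gallon figures are round numbers I made up for illustration and do not come from either report. The sketch also assumes every vehicle is driven the same number of miles.

```python
def fleet_fuel_reduction(share_replaced, old_mpg, new_mpg, rest_mpg):
    """Fractional drop in fleet fuel use per mile after the replacement.

    Fuel use per mile is 1/mpg, so the fleet average is a weighted sum of
    1/mpg over the replaced share and the rest of the fleet.
    """
    rest = (1 - share_replaced) / rest_mpg
    before = rest + share_replaced / old_mpg
    after = rest + share_replaced / new_mpg
    return (before - after) / before


# Half a percent of the fleet, with hypothetical mpg values.
reduction = fleet_fuel_reduction(0.005, old_mpg=16, new_mpg=25, rest_mpg=20)
print(f"fleet fuel use falls by roughly {reduction:.2%}")
```

With only half a percent of the fleet replaced, the overall change stays a small fraction of a percent under any plausible mileage figures.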