NYT's most popular piece of content in 2013, “How Y’all, Youse and You Guys Talk,” generates a personalized dialect map based on user responses compared to data from more than 350,000 survey responses collected in 2013.

How do you create the most popular piece of content of the year at one of the nation’s most prestigious news outlets? Well, for starters, study or consider careers in politics, law, and philosophy before eventually deciding that statistics is for you. Then apply to grad school and, while you're there, dig into some intriguing data that Harvard researchers had published 10 years prior, apply some stats and smart algorithms, post your work online, then wait for The New York Times to call.

That’s not the whole story, of course, but it’s the rough run-up to how Josh Katz ended up an intern at the Times last fall and eventually created (with graphics editor Wilson Andrews) the newspaper’s most popular piece of content in 2013, “How Y’all, Youse and You Guys Talk.”

“I’d enjoyed the news as a consumer,” Katz said, “but I'd never really pictured myself as being a part of the journalism world.”

“I’d always had an interest in data visualization and finding a way of communicating results graphically,” he said. “What I didn’t realize is that that is essentially a lot of what they do at Times graphics, so it was really a perfect fit.”

Katz’s personal journey to the Times is a fun one, but the story of the technology behind the popular project is just as good.

The Harvard Dialect Survey maps created by researchers in 2003.

Last March Katz was a grad student in the Department of Statistics at North Carolina State University and had recently decided he wanted to look more closely at an interesting set of data he’d seen 10 years prior, the Harvard Dialect Survey. The study was based on the responses of more than 50,000 people to 122 questions on dialect, and had been presented by the researchers (Bert Vaux and Scott Golder) as a series of colored points on a map.

While the data was interesting, Katz wanted to show a more elegant “smoothed estimate” of the same data. Using the k-nearest neighbor algorithm and kernel density estimation, he created a series of maps of the Harvard data that most of us would call heat maps.

In June he posted those maps on the North Carolina State University website and on a community site for R developers.

By August the graphics desk at the Times had discovered them and invited him to New York for an internship starting in September.

A map from Katz's smoothing project based on the Harvard dialect data.

Though satisfied with the work he’d done with the data thus far, Katz had also come up with a plan to verify and update the data and turn it into a quiz. To do this he’d need to whittle down the original set of 122 questions to a manageable number. He’d also need to figure out if dialects in the United States had changed over the last 10 years.

Using suggestions from the online community, he came up with 20 additional questions he thought would help him determine changes in dialect, built a survey of more than 140 questions (the original 122 plus his 20 new ones), and posted it online.

“One of the great things about doing this online was that you get all of this instant feedback and a lot of people have great suggestions,” Katz said.

Of the 140 questions asked, a good portion didn’t tell Katz much. Pancakes or flapjacks? Everyone says pancakes, Katz said. So that question and about 120 others were thrown out.
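For readers curious what a “smoothed estimate” built from k-nearest neighbors and a kernel might look like, here is a minimal sketch. It is not Katz's code (he shared his work with R developers, and his implementation is not shown in this article); the function name `smoothed_share`, the Gaussian kernel, and the choice of bandwidth are all assumptions made for illustration. The idea is to estimate, at any point on the map, what share of nearby respondents gave a particular answer, with closer respondents counting more.

```python
import math

def smoothed_share(query, points, answers, target, k=3):
    """Kernel-weighted share of the k nearest respondents answering `target`.

    query   -- (x, y) location where the estimate is wanted
    points  -- list of (x, y) respondent locations
    answers -- list of answers, parallel to `points`
    target  -- the answer whose local share is being estimated
    k       -- number of nearest respondents to use (hypothetical default)
    """
    # Pair each respondent's distance from the query with their answer,
    # then sort so the nearest respondents come first.
    by_distance = sorted(
        ((math.dist(query, p), a) for p, a in zip(points, answers)),
        key=lambda pair: pair[0],
    )
    nearest = by_distance[:k]

    # Assumed bandwidth: the distance to the k-th neighbor, so the
    # smoothing adapts to how densely the area was sampled.
    bandwidth = max(nearest[-1][0], 1e-12)

    # Gaussian kernel weights: closer respondents count more.
    weights = [math.exp(-0.5 * (d / bandwidth) ** 2) for d, _ in nearest]

    matching = sum(w for w, (_, a) in zip(weights, nearest) if a == target)
    return matching / sum(weights)


# Toy data: two respondents in the west say "pancakes", two in the east
# say "flapjacks".
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
answers = ["pancakes", "pancakes", "flapjacks", "flapjacks"]

west = smoothed_share((0.5, 0), points, answers, "pancakes", k=2)
east = smoothed_share((10.5, 0), points, answers, "pancakes", k=2)
```

Evaluating such a function on a grid of locations and coloring each cell by the result would give a heat-map-style picture instead of a scatter of colored points.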