Dan Simonson

[dæn sɑɪmɨnsʌn]

Computational Linguist

Dan Simonson is a PhD Candidate at Georgetown University in the Department of Linguistics within the computational concentration. He's most interested in using linguistics and natural language processing to take objective and interesting slices of the real world to yield insight and understanding. In particular, this has led him to pursue problems related to narrative schemas, information extraction and retrieval, semantic modality, critical discourse analysis, and applying these topics to one another.

He recently (February 2015) passed his PhD Oral Exam. His orals committee consisted of Tony Davis, Amir Zeldes, and Nate Chambers.

Dan is most easily found in rooms with free coffee and on a bicycle. He tried both at the same time once in his youth and ended up burning his hand.

He also feels very weird writing about himself in the third person and will cease immediately.


Off-site Stuff


Use my name at gmail.com. (I will respond to you from a different address that the first one forwards to. Let me know if you have some kind of filter that requires me to respond from the first one.)

Or, on Twitter, @thedansimonson.

Tools and Goodies

Bash Your Way into Bash

Bash Your Way into Bash is a tutorial for using bash. It's for people who have no experience using command line interfaces.

ft library for Python

ft is a library for dealing with lists of dictionaries in Python. It makes counting and finding things eas(y|ier). You can get ft off of pip (pip install ft). I've made an official page here with some code examples; the in-code documentation is pretty solid.
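To give a flavor of the kind of task ft is meant for, here is a minimal sketch in plain Python. It does not use ft's actual API: the sample records and the helper names `tally` and `where` are made up for illustration only.

```python
from collections import Counter

# Hypothetical data: a list of dictionaries, one per token.
tokens = [
    {"word": "coffee", "pos": "NOUN"},
    {"word": "ride", "pos": "VERB"},
    {"word": "bicycle", "pos": "NOUN"},
    {"word": "burn", "pos": "VERB"},
    {"word": "hand", "pos": "NOUN"},
]


def tally(records, key):
    """Count how often each value of `key` occurs across the records."""
    return Counter(r[key] for r in records if key in r)


def where(records, **conditions):
    """Return the records whose fields match every key=value condition."""
    return [
        r for r in records
        if all(r.get(k) == v for k, v in conditions.items())
    ]


pos_counts = tally(tokens, "pos")   # counts per part of speech
nouns = where(tokens, pos="NOUN")   # only the records tagged NOUN
```

The point is only that counting and filtering over lists of dictionaries comes up constantly; see ft's own documentation for what the library really provides.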

Simple Python Twitter Scraper (SPyTS)

SPyTS is a tool for scraping tweets. For us plebs who don't have firehose access to Twitter, it spreads queries out as evenly as possible over a period and prevents exceeding Twitter's API rate limits. SPyTS is available on github.
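The idea of spreading queries evenly while staying under a rate limit can be sketched roughly as follows. This is not SPyTS's actual code: the function name `schedule_queries`, its parameters, and the limit values in the example are assumptions for illustration, not Twitter's real numbers.

```python
def schedule_queries(n_queries, period_seconds, max_per_window, window_seconds):
    """Return start offsets (in seconds) spreading n_queries evenly over a period.

    Raises ValueError if even spacing would still exceed the assumed rate
    limit of `max_per_window` queries per `window_seconds`.
    """
    if n_queries <= 0:
        return []
    # Even spacing: one query every `interval` seconds.
    interval = period_seconds / n_queries
    # Smallest spacing that keeps any window at or under the limit.
    min_interval = window_seconds / max_per_window
    if interval < min_interval:
        raise ValueError("too many queries for this period and rate limit")
    return [i * interval for i in range(n_queries)]


# Hypothetical numbers: 4 queries over an hour, at most 180 per 15 minutes.
offsets = schedule_queries(4, 3600, 180, 900)
```

A real scraper would then sleep until each offset before firing its query; the sketch only covers the scheduling arithmetic.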

Research and Publications

Narrative Schemas

A lot of what we know is grounded in stories. Some stories tend to repeat themselves, and once they do enough, it's been hypothesized that we genericize the stories into something called a schema. I work to extract this type of world knowledge from language data and apply it to problems that would otherwise be inaccessible to quantitative analysis.

My dissertation focuses on extracting these sorts of schemas, understanding their distribution and properties, and applying this knowledge to practical, real-world tasks.

You can find more information on this vein of research here.

Simonson, D. and Davis, A. (2016, November). NASTEA: Investigating Narrative Schemas through Annotated Entities. In the Second CnewS Workshop, EMNLP 2016, Austin, TX. [paper] [Workshop Slides] [DCNLP Slides]

Simonson, D. and Davis, A. (2015, July). Interactions between Narrative Schemas and Document Categories. In the First CnewS Workshop, ACL 2015, Beijing, China. [paper]


Throughout my time at Georgetown, I have been involved in a project building a theory and corpus of gradable modal expressions [NSF-funded, BCS-1053038]. Modal expressions are those that express possibilities. They span all parts of speech.

I played a number of roles on this project. During the experimental aspects of the project, where the annotation guidelines were developed and tested, I was responsible for reporting interannotator agreement scores. During the corpus construction component, I built and maintained a cross-platform tool for adjudicating annotator output. Throughout both stages of the project, I managed data as it flowed between phases of the project.

Rubinstein, A., Harner, H., Krawczyk, E., Simonson, D., Katz, G., and Portner, P. (2013). Toward Fine-grained Annotation of Modality in Text. In Proceedings of the Tenth International Conference for Computational Semantics (IWCS 2013). [Paper]

Simonson, D., Rubinstein, A., Chung, J., Harner, H., Katz, E.G., and Portner, P. (2012, February). Categorizing Modals with Amazon Mechanical Turk. In the Proceedings of the Mid-Atlantic Colloquium of Studies in Meaning (MACSIM 2012). [Poster]

Other Assorted Linguistic Work

Zeldes, A. and Simonson, D. (2016, August). Different Flavors of GUM: Evaluating Genre and Sentence Type Effects on Multilayer Corpus Annotation Quality. In the Proceedings of LAW X: 10th Linguistic Annotation Workshop, Berlin. [paper]

Sierra, S. and Simonson, D. (2014, October). Gender and cool solidarity in Mexican Spanish slang phrases. In the Proceedings of New Ways of Analyzing Variation 43. Chicago, IL. [Slides from NWAV Presentation]

Undergraduate Research in Astronomy

During my undergrad, I was a physics major and participated in astronomy research under the guidance of Harold Butner. I presented posters on this research at two annual meetings of the American Astronomical Society. These posters came out of two separate projects involving the DEBRIS target set, a search for binary stars using the Herschel Space Observatory. The first was a search for estimates of stellar age in the literature on our project's target stars; the second reported preliminary results of an infrared observing run that identified candidate binaries.

Simonson, D. E., Butner, H. M., Trelawny, D. T., Evans, C. M., Duchene, G., Rodriguez, D. R., ... and DEBRIS, T. (2010, January). Searching for Previously Unresolved Binaries in DEBRIS Survey Target Stars. In Bulletin of the American Astronomical Society (Vol. 42, p. 400). [Poster]

Butner, H. M., McCauley, P., Simonson, D., Matthews, B., Greaves, J. S., Duchene, G., ... and Zuckerman, B. (2009, January). Stellar Ages Of The Debris Sample Stars. In Bulletin of the American Astronomical Society (Vol. 41, p. 209). [Poster]

Random Stuff

Pluto never should have been a planet.

My first website was D's C&c Page. I lost the whole thing when, for some reason, Geocities decided I violated their TOS. No explanation was given.

wow such linguistics

djorno: pizza for python