I am a lecturer in modern Chinese literature at UMD, and I am looking for panelists for a Digital Technology Workshop (AAS 2021) on how to build a corpus or database. I invite fellow digital humanists and social scientists who have experience building their own corpus or database to share the technologies and methods they use. I will present on how I built my own database, the Chronological Database of Chinese Literature (CDCL). CDCL is a fully machine-accessible plain-text digital database of almost 2,000 titles of Chinese literature from the Three Kingdoms period to the modern era (http://ashleyliuresearch.com/digitalhumanities/). I will introduce methods such as web-scraping Wikisource and converting ebooks into plain-text files and data.

Please note that this is a workshop, so no paper is required. Only presentations and tutorials are expected.

If you are interested, please email me at liuyx@umd.edu or liuyx@sas.upenn.edu.

