Web Archives, Web Archiving, and Web Scraping
This workshop introduces you to working with the web as a data source for your research. It gives a broad overview of the major techniques and considerations involved, so that you can start asking the right questions for your research.
We'll cover the following topics, using a mixture of case studies and practical exercises:
- How the web works, and how this affects the ways data can be collected.
- Working with existing web archives, such as the Internet Archive and the Australian Web Archive, to examine history.
- Making your own archives of web material for practical and reliable data collection.
- Extracting structured content from websites and web archives (scraping).
- Ethical and practical implications of web-based data collection.
Please bring along a laptop so you can work through the practical exercises.
This is a catered event with limited availability. Please register to secure your place.
Facilitator: Dr Sam Hames
Dr Sam Hames is a research fellow in computational humanities with UQ's School of Languages and Cultures. He also works on the Language Data Commons of Australia and the Australian Text Analytics Platform. Sam's PhD was on machine learning for medical imaging analysis, and he has an extensive background as a data-focused software developer supporting social media and web researchers. His primary research focus is to understand how computation can enable qualitative and interpretive inquiry across the humanities and social sciences.