Web Archiving, Web Archives, and Web Scraping 

This workshop will introduce you to working with the web as a data source for your research. The aim is to give you a broad overview of the major techniques and considerations involved, so that you can start asking the right questions for your own work.

We'll cover the following topics, using a mixture of case studies and practical exercises:

  • How the web works, and how this affects the ways and means of data collection.
  • Working with existing web archives, such as the Internet Archive and the Australian Web Archive, to examine history.
  • Making your own archives of web material for practical and reliable data collection.
  • Extracting structured content from websites and web archives (scraping).
  • Ethical and practical implications of web-based data collection.

Please bring along a laptop so you can work through the practical exercises.

This is a catered event with limited availability. Please register to secure your place.

Facilitator: Dr Sam Hames

Dr Sam Hames is a research fellow in computational humanities with UQ's School of Languages and Cultures, and also works on the Language Data Commons of Australia and the Australian Text Analytics Platform. Sam's PhD was on machine learning for medical imaging analysis, and he has an extensive background as a data-focused software developer supporting social media and web researchers. His primary research focus is to understand how computation can enable qualitative and interpretive inquiry across the humanities and social sciences.

About The Centre for Digital Cultures and Societies Events

DCS runs a busy calendar of events throughout the year, including many opportunities for digital research training. You will find details of our feature events below. Stay up to date with our full range of events by keeping an eye on our website and subscribing to our newsletter.