Reto Hofstetter

Course location

University of St.Gallen

Home university

University of Lucerne
Professor Reto Hofstetter received a PhD, MA, and BA in business administration and economics from the University of Bern, Switzerland, and a BA in computer science from the University of Applied Sciences of Bern. He is currently a Professor of Marketing at the Faculty of Economics and Management of the University of Lucerne. Before Lucerne, he was a faculty member at the Università della Svizzera italiana (USI) and served as assistant professor at the University of St. Gallen. He visited the Wharton School and Stanford University as part of his research. Professor Hofstetter’s current research focuses on digital consumer behavior, crowdsourcing, creativity, self-presentation, self-reporting, and disclosure. His research has been published in top-tier academic journals, including the Journal of Marketing Research, Management Science, and PNAS, and has been featured by Harvard Business Review and Forbes.

Courses taught by this instructor

Course

Description

Instructor

Level

Next course

Location

Data Scraping & Management for Social Scientists with R

Online platforms such as Yelp, Twitter, Amazon, and Instagram are large-scale, rich, and relevant sources of data. Researchers in the social sciences increasingly tap into these data for field evidence when studying various phenomena. In this course, you will learn how to find, acquire, store, and manage data from such sources and prepare them for follow-up statistical analysis in your own research. After a short introduction to the relevance of data science skills for the social sciences, we will review R as a programming language and its basic data formats. We will then use R to program simple scrapers that systematically extract data from websites. We will use the packages rvest, httr, and RSelenium, among others, for this purpose. You will also learn how to read HTML, CSS, JSON, and XML, to use regular expressions, and to handle string, text, and image data. To store the data, we will look into relational databases, (My)SQL, and related R packages. Many websites, such as Twitter and Yelp, offer convenient application-programming interfaces (APIs) that facilitate the extraction of data, and we will look into accessing them from R. Finally, we will highlight some options for feature extraction from images and text, which allows us to augment our collected data with meaningful variables we can use in our analysis. At the end of this course, students should be able to identify valuable online data sources, to write basic scrapers, and to prepare the collected data so that they can use them for statistical analysis as part of their own research projects. Throughout the course, students will work on a data-scraping project related to their theses. This project will be presented on the final day of the course.
...

...

B

2024
