Data Collection Using Python and Stata

This two-day online course introduces participants to web data extraction using Python within Stata. It covers the basics of Python and HTML and teaches how to retrieve and parse online data through hands-on exercises. By the end, attendees will be able to write basic Python code and apply it to their own data extraction tasks in Stata.

360,00 €
2 Days
Online via Teams
Stata

Overview

Accessing and retrieving online data is increasingly vital for researchers and analysts. This two-day, interactive online seminar explores how to use Python—within Stata—to scrape and structure online data for analysis.

Participants will learn how to identify, extract, and convert online data (e.g., HTML tables or embedded content) into formats compatible with Stata (.txt, .csv). The course covers the basics of Python and HTML parsing, progressing to hands-on coding sessions. No prior coding experience is required, though a basic understanding of Stata or Python is beneficial.
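The conversion described above, from an HTML table to a Stata-compatible .csv, can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the course's own material: the `TableParser` class, the `html_table_to_csv` function, and the sample table are all hypothetical, and the sample values are made up. It assumes a single, well-formed table with no nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Hypothetical sample page content (made-up values, for illustration only).
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
"""

csv_text = html_table_to_csv(SAMPLE_HTML)
print(csv_text)
```

Writing `csv_text` to a file would give something Stata's `import delimited` command can read. In practice the HTML would come from a downloaded page rather than an inline string, and real pages often need extra handling for nested tags or merged cells.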

Course Aims & Objectives
  • Introduce data scraping techniques using Python embedded within Stata.

  • Provide foundational knowledge of Python programming and HTML structure.

  • Equip participants with skills to identify and extract online data for quantitative analysis.

Key Skills Acquired

By the end of the course, students will understand:

  • The basics of Python programming.

  • How web pages are structured and how HTML is parsed.

  • How to identify and extract online data such as HTML tables or embedded content.

  • How to convert extracted data into formats compatible with Stata (.txt, .csv).

  • How to write efficient Python code for web scraping within Stata.

Learning Outcomes
  • Technical Fluency: Gain practical experience with Python inside Stata, focusing on scripting for web scraping.

  • Data Acquisition Skills: Learn to extract useful, structured data from unstructured web pages.

  • Problem-Solving: Develop the ability to troubleshoot typical data scraping challenges and adapt code to new data sources.

  • Application-Oriented Learning: Build transferable skills applicable to academic, policy, and private-sector research projects.

Course Structure

Format: Two-day online seminar
Daily Sessions: 10:00–12:00 & 14:00–16:00 (BST)
Q&A: 1-hour concluding session on Day 2

Total contact time: 8 hours of instruction + 1 hour Q&A

Agenda

Day 1:

Lecture 1: Python Basics
Lecture 2: Web Structure and HTML Fundamentals

Day 2:

Lecture 3: Extracting and Saving Data
Lecture 4: Writing Efficient Python Code for Web Scraping

Prerequisites

No specific readings are required. A basic knowledge of Stata and Python is useful but not essential.

Course Timetable

Subject to minor changes

Day      | Morning Session           | Afternoon Session (including Tutorial)
Day One  | 10am-12pm (London time)   | 2pm-4pm (London time)
Day Two  | 10am-12pm (London time)   | 2pm-4pm (London time)

Delivered By