Importing financial data from Google Finance and FRED using python

9 minute read

Importing financial data using pandas_datareader

We can import data from Google Finance and the Federal Reserve Data Service through the pandas_datareader.

In this post, we will learn:

  • How to pull stock prices from various online APIs like Google or Yahoo! Finance?
  • How to pull macro data from the Federal Reserve?

I will show how to pull the stock prices for individual companies and group of companies. Finally, we will automate the pulling of stock prices for Top 3 financial companies.

Importing stock prices from Google Finance using IEX

Let’s get started with an example on how to import stock prices for Google using Google Finance.

NOTE: Google Finance has deprecated their API but DataReader now makes available the data source 'iex'. To experiment with the data, we can use the sandbox token, as discussed further below.

First you need to import DataReader class from the data module in the pandas_datareader package. Since you will request a time series of stock prices, you need to import the datetime library that provides the functionality to define a range of dates needed for your request.

Now you just need two more inputs: the stock ticker, and the data source - let’s use the ticker for Apple (AAPL), and Google finance (IEX) as source. With these variables defined, you can create a DataReader object. This reads the data from Google finance, and imports it into a DataFrame that we will call stock_data.

The need for pandas_datareader package

First, create an IEX cloud account from IEX website (https://iexcloud.io). This provides a sandbox environment to get yourself familiar with the API. A very simple example is shown below, where we are pulling the AAPL stock quote using the sandbox token. As you can see, the response contains a boat load of information. You will need to extract relevant information from the response and build a dataframe. This is where using something like pandas_datareader can come handy.

import pandas as pd
import requests
from datetime import datetime, timedelta

# IEX cloud sandbox
url = 'https://sandbox.iexapis.com/stable/stock/AAPL/quote?token=Tpk_d0a16f7723884324876799d553ec66bb'

response = requests.get(url).json()
response
{'avgTotalVolume': 65891391,
 'calculationPrice': 'close',
 'change': 1.06,
 'changePercent': 0.0073,
 'close': 152.4,
 'closeSource': 'iliffcoa',
 'closeTime': 1671660667360,
 'companyName': 'Apple Inc',
 'currency': 'USD',
 'delayedPrice': 152.11,
 'delayedPriceTime': 1668451624211,
 'extendedChange': -0.14,
 'extendedChangePercent': -0.00091,
 'extendedPrice': 152.71,
 'extendedPriceTime': 1634337293748,
 'high': 156.08,
 'highSource': 'rt dlenu  ieaed5cpm1iye',
 'highTime': 1659092326545,
 'iexAskPrice': None,
 'iexAskSize': None,
 'iexBidPrice': None,
 'iexBidSize': None,
 'iexClose': 149.77,
 'iexCloseTime': 1697798045558,
 'iexLastUpdated': None,
 'iexMarketPercent': None,
 'iexOpen': 152.13,
 'iexOpenTime': 1651196253899,
 'iexRealtimePrice': None,
 'iexRealtimeSize': None,
 'iexVolume': None,
 'lastTradeTime': 1690496771796,
 'latestPrice': 148.7,
 'latestSource': 'Close',
 'latestTime': 'August 27, 2021',
 'latestUpdate': 1648383242845,
 'latestVolume': 57371979,
 'low': 147.17,
 'lowSource': '5letear  i i1dyduencmep',
 'lowTime': 1687540161890,
 'marketCap': 2457149036038,
 'oddLotDelayedPrice': 148.6,
 'oddLotDelayedPriceTime': 1664962680763,
 'open': 150.81,
 'openTime': 1644772963742,
 'openSource': 'faiifclo',
 'peRatio': 29.37,
 'previousClose': 149.34,
 'previousVolume': 48648929,
 'primaryExchange': 'QADSAN',
 'symbol': 'AAPL',
 'volume': 57990654,
 'week52High': 154.6,
 'week52Low': 104.5,
 'ytdChange': 0.13521478015292396,
 'isUSMarketOpen': False}

pandas_datareader package

The pandas data_reader is a great package that provides easy access to a range a financial data sources on the web. We will see that it takes very little code to import data directly into a pandas Dataframe. Available web sources include Yahoo and Google Finance for stock prices and derivatives. There is also plenty of global economics data from the Federal Reserve research portal, the word bank, the OECD, or EuroStat, as well as exchange rates from OANDA.

!pip install pandas_datareader

Using IEX and pandas_datareader package to get stock data

#!pip install pandas_datareader

# Import DataReader
from pandas_datareader.data import DataReader

# Import date
from datetime import date

import os

# Set start and end dates
start = date(2021, 1, 1)
end = date.today()

# Set the ticker.
# A stock ticker is the unique symbol needed to get stock information for a certain company.
ticker = 'AAPL'

# Set the data source
data_source = 'iex'

# set your api_key and enable sandbox
os.environ["IEX_API_KEY"] = 'Tpk_d0a16f7723884324876799d553ec66bb'
os.environ['IEX_SANDBOX'] = 'enable'

# Import the stock prices
stock_prices = DataReader(ticker, data_source, start, end)

# Display and inspect the result
display(pd.concat([stock_prices.head(), stock_prices.tail()]))
stock_prices.info()
open high low close volume
date
2021-01-04 137.14 137.9941 128.840 132.61 150210581
2021-01-05 133.04 137.6500 131.140 131.74 101823285
2021-01-06 131.55 134.1335 126.906 127.00 158914953
2021-01-07 130.57 137.2300 130.470 136.37 110398635
2021-01-08 138.13 138.1300 133.210 134.44 106669738
2021-08-23 151.27 157.2800 150.860 150.86 61950690
2021-08-24 154.91 156.1800 152.610 156.18 50958976
2021-08-25 152.68 156.8800 148.960 148.96 59092792
2021-08-26 149.90 155.5800 147.880 154.23 49078540
2021-08-27 153.51 153.6800 151.900 151.90 56680378
<class 'pandas.core.frame.DataFrame'>
Index: 165 entries, 2021-01-04 to 2021-08-27
Data columns (total 5 columns):
open      165 non-null float64
high      165 non-null float64
low       165 non-null float64
close     165 non-null float64
volume    165 non-null int64
dtypes: float64(4), int64(1)
memory usage: 7.7+ KB

Visualize a stock price trend

# Import matplotlib.pyplot
import matplotlib.pyplot as plt

# Plot close
stock_prices['close'].plot(title=ticker, figsize=(20,10))

# Show the plot
plt.show()

png

Get data from FRED

Now we will see how to get economic data from the Federal Reserve (Fed). The Fed has an online portal called Federal Reserve Economic Data or FRED. This service offers almost 500,000 data series in a broad range of categories like economic growth and employment, monetary and fiscal policy, or demographics, industry trends and commodity prices. Depending on the subject, time series may be available at daily, monthly or annual frequencies.

Each series has a code that you need to access the data, and that you can obtain from the FRED website. Just navigate to https://fred.stlouisfed.org/ . A search for the term interest rate produces over 1,400 results. Let’s pick the 10-year Treasury Rate at daily frequency.

Below we visualize the gold price trend over the last 50 years, specifically, the Gold Fixing Price 10:30 AM (London time) in London Bullion Market, in US Dollars series code GOLDAMGBD228NLBM.

# Set start date
start = date(1968, 1, 1)

# Set series code
series = 'GOLDAMGBD228NLBM'

# Import the data
gold_price = DataReader(series, 'fred', start=start)

# Inspect the price of gold
gold_price.info()

# Plot the price of gold
gold_price.plot(title='Gold Price', figsize=(20, 10))

# Show the plot
plt.show()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 13935 entries, 1968-04-01 to 2021-08-27
Data columns (total 1 columns):
GOLDAMGBD228NLBM    13504 non-null float64
dtypes: float64(1)
memory usage: 217.7 KB

png

Compare labor market participation and unemployment rates

Two economic data series in FRED are the Civilian Unemployment Rate (‘UNRATE’) and the Civilian Labor Force Participation Rate (‘CIVPART’).

These rates highlight two important aspects of the US labor market: the share of the civilian population that is currently unemployed or seeking employment, and the share of those active in the labor market that are in fact employed.

This means that the numbers indicate both the size of the labor market relative to the total population, as well as the size of unemployment relative to the labor market.

NOTE: how using the subplots=True makes separately plots the time series.

# Set the start date
start = date(1950, 1, 1)

# Define the series codes
series = ['UNRATE', 'CIVPART']

# Import the data
econ_data = DataReader(series, 'fred', start)

# Assign new column labels
econ_data.columns = ['Unemployment Rate', 'Participation Rate']

# Plot econ_data
econ_data.plot(subplots=True, title='Labor Market', figsize=(20, 10))

# Show the plot
plt.show()

png

econ_data.head()
Unemployment Rate Participation Rate
DATE
1950-01-01 6.5 58.9
1950-02-01 6.4 58.9
1950-03-01 6.3 58.8
1950-04-01 5.8 59.2
1950-05-01 5.5 59.1

Get several stocks at once and manage a multi-index

We will see how to retrieve several stocks from Yahoo finance, and how to manage a DataFrame with more than a single index.

You will be using the listing information to now select several stocks at once, like for instance the largest 3 stocks per sector. You will then use Google finance to retrieve data for these stocks with a single DataReader object. In the process, you will learn about the pandas MultiIndex, which allows you to manage more complex datasets.

You can pass this list directly to the data reader to get all information with a single call. The result is a panel data structure, which has three dimensions.

As I have shown how to calculate the top 5 companies listed on Nasdaq in Technology Sector

['AAPL', 'GOOGL', 'GOOG', 'MSFT', 'AMZN']

tickers = ['AAPL', 'GOOGL', 'GOOG', 'MSFT', 'AMZN']

data_source = 'iex'

# Set start and end dates
start = date(2021, 1, 1)
end = date.today()

panel = DataReader(tickers, data_source, start, end)
print(type(panel))
panel.head()
<class 'pandas.core.frame.DataFrame'>
Attributes open high ... close volume
Symbols AAPL AMZN GOOG GOOGL MSFT AAPL AMZN GOOG GOOGL MSFT ... AAPL AMZN GOOG GOOGL MSFT AAPL AMZN GOOG GOOGL MSFT
date
2021-01-04 138.00 3363.00 1768.00 1843.000 225.63 138.0000 3363.00 1779.26 1843.000 227.2900 ... 131.37 3233.87 1779.26 1765.45 227.29 148912525 4523431 1958537 1955045 38922952
2021-01-05 134.77 3301.34 1767.00 1795.130 225.69 136.4500 3381.78 1806.10 1795.130 227.7300 ... 134.64 3360.54 1806.10 1790.96 222.20 100732153 2728824 1164039 1023133 24417760
2021-01-06 129.61 3160.93 1736.46 1711.233 218.61 132.8084 3265.49 1811.08 1778.017 226.2235 ... 130.50 3236.81 1811.08 1773.05 215.25 155997890 4610666 2651928 2338256 36339530
2021-01-07 131.23 3277.00 1811.83 1749.430 215.25 135.8400 3277.00 1855.10 1863.710 227.1200 ... 135.84 3171.01 1840.55 1844.65 220.91 109932596 3539502 2369612 2148200 28134387
2021-01-08 135.51 3259.00 1845.92 1819.680 221.35 135.5100 3259.00 1891.25 1863.365 230.1500 ... 134.57 3205.30 1871.02 1798.86 220.34 106751618 3638946 2139724 1861053 23977301

5 rows × 25 columns

The data we got back is already in the Wide Format, but the columns are in a Multi-Index (2-level, with top level being open, high, close, volume. If we choose the close price, here’s what we will end up with.

panel['close'].head()
Symbols AAPL AMZN GOOG GOOGL MSFT
date
2021-01-04 131.37 3233.87 1779.26 1765.45 227.29
2021-01-05 134.64 3360.54 1806.10 1790.96 222.20
2021-01-06 130.50 3236.81 1811.08 1773.05 215.25
2021-01-07 135.84 3171.01 1840.55 1844.65 220.91
2021-01-08 134.57 3205.30 1871.02 1798.86 220.34
stock_prices = panel['close']
stock_prices.plot(subplots=True, figsize=(20, 10))
plt.tight_layout() # removes extra white spaces

png

Get stock data for the 3 largest Financial companies

# Create the pd.ExcelFile() object
xls = pd.ExcelFile('data/listings.xlsx')

# Extract the sheet names from xls
exchanges = xls.sheet_names

# Create an empty list: listings
listings = []

# Import the data
for exchange in exchanges:
    listing = pd.read_excel(xls, sheet_name=exchange, na_values='n/a')
    listing['Exchange'] = exchange
    listings.append(listing)

# Concatenate the listings: listing_data
listing_data = pd.concat(listings)

# Inspect the results
listing_data.sample(5)
Stock Symbol Company Name Last Sale Market Capitalization IPO Year Sector Industry Exchange
459 MQY Blackrock MuniYield Quality Fund, Inc. 15.06 4.623454e+08 1992.0 NaN NaN nyse
221 ARCH Arch Coal, Inc. 70.76 1.769129e+09 2016.0 Energy Coal Mining nyse
1546 AGFS AgroFresh Solutions, Inc. 4.87 2.469265e+08 2014.0 Finance Business Services nasdaq
139 ESNC EnSync, Inc. 0.71 3.408735e+07 NaN Public Utilities Electric Utilities: Central amex
174 GTE Gran Tierra Energy Inc. 2.64 1.031731e+09 NaN Energy Oil & Gas Production amex
# Set Stock Symbol as the index
listings_ss = listing_data.set_index('Stock Symbol')

# Get ticker of 3 largest tech companies
top_3_companies = listings_ss.loc[listings_ss.Sector == 'Finance', 'Market Capitalization'].nlargest(n=3)

# Convert index to list
top_3_tickers = top_3_companies.index.tolist()

data_source = 'iex'

# Set start and end dates
start = date(2021, 1, 1)
end = date.today()

# Import stock data
panel = DataReader(top_3_tickers, data_source, start, end)
stock_prices = panel['close']
stock_prices.plot(subplots=True, figsize=(20, 10))
plt.tight_layout() # removes extra white spaces

png

Tags:

Updated: