In the context of search engine optimization on Amazon (Amazon SEO), it is essential that relevant keywords are placed within the listing.
In this article, we'll show you how to identify missing keywords for your listings using the Brand Analytics Search Term Report and some Python code.
- What is included in the Brand Analytics search term report?
- How to find missing keywords?
- What data is required?
- Identify keyword ideas
- Conclusion
What is included in the Brand Analytics search term report?
The Brand Analytics search term report is published weekly by Amazon. It contains up to one million of the most frequently used search terms on Amazon for the US or other marketplaces. For each search term, it also lists the three ASINs that customers clicked most frequently after searching for it. You can find more information about the report in our article about the Brand Analytics search term report.
How to find missing keywords?
The idea behind identifying missing keywords for your listing is quite simple and can be broken down into five steps. We determine:
1. the search terms for which your product is clicked
2. the other ASINs that are clicked for the search terms from step 1
3. the search terms for which the ASINs from step 2 are clicked
4. the search terms that appear in step 3 but not in step 1
5. the search terms from step 4 that are not included in the listing
Once coded, the script can be executed for all products. But more about this later.
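To make the five steps concrete before working with real data, here is a minimal sketch using plain Python sets. The search terms, the ASINs (`A1`, `B1`, ...) and the listing text are made up for illustration; the real data comes from the Brand Analytics report and your listing file.

```python
# Toy click data: search term -> most-clicked ASINs (all values are made up).
clicks = {
    "glass bowl with lid": ["A1", "B1", "B2"],
    "glass bowls": ["B1", "B2", "C1"],
    "mixing bowl": ["B2", "C1", "C2"],
    "garden hose": ["C3", "C4", "C5"],
}
own_asin = "A1"
listing_text = "glass bowl with lid, dishwasher safe"

# Step 1: search terms for which our product is clicked
own_terms = {term for term, asins in clicks.items() if own_asin in asins}
# Step 2: other ASINs clicked for those search terms
other_asins = {a for term in own_terms for a in clicks[term]} - {own_asin}
# Step 3: search terms for which those ASINs are clicked
other_terms = {term for term, asins in clicks.items() if other_asins & set(asins)}
# Step 4: search terms from step 3 that are not in step 1
new_terms = other_terms - own_terms
# Step 5: single words from step 4 that do not appear in the listing
new_words = {word for term in new_terms for word in term.split()}
missing_words = sorted(w for w in new_words if w not in listing_text.lower())

print(missing_words)  # ['bowls', 'mixing']
```

The unrelated term "garden hose" never enters the result because none of its ASINs share a search term with our product.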
What data is required?
To determine the keyword ideas, you need two sets of data:
- a current Brand Analytics search term report. The report is available to brand owners on Amazon and can be downloaded from Brand Analytics.
- a CSV file with your own listing data.
The CSV file with the listing data should contain the following columns:
- Marketplace (e.g. "US")
- ASIN
- Title
- Bullet 1
- Bullet 2
- Bullet 3
- Bullet 4
- Bullet 5
Optionally, you can add the backend keywords (also called "hidden keywords") to the file if you want the script to check them as well.
Identify keyword ideas
Let's go through the steps using an example. Here we take the product with the ASIN B0000BYCGF, which, judging by the search terms it is clicked for, is a Pyrex glass bowl with lid.
Now we go through the steps explained above. Before that, we have to load our data.
Load Brand Analytics data
First of all, we load the Brand Analytics search term data:
import pandas as pd

fileNameBA = "./Amazon-Searchterms-US.csv"
thousandSeparator = ","  # US
columns = ["Search Term", "Search Frequency Rank", "#1 Clicked ASIN", "#2 Clicked ASIN", "#3 Clicked ASIN"]  # US
# Load data; skiprows=1 skips the first line of the file so that the next line is used as the header.
# error_bad_lines was removed in pandas 2.0; raising on malformed lines is the default behavior anyway.
dfBA = pd.read_csv(fileNameBA, thousands=thousandSeparator, usecols=columns, engine="python", encoding='utf-8', skiprows=1, sep=",")
# Rename columns
dfBA.columns = ['searchterm', 'rank', '1', '2', '3']
# Melt dfBA from wide to long: one row per search term and click position
dfBA_Long = dfBA.melt(id_vars=["searchterm", "rank"], var_name="position", value_name="ASIN")
# Make position an int
dfBA_Long = dfBA_Long.astype({"position": int})
# Drop rows with missing values (search terms with fewer than three clicked ASINs)
dfBA_Long = dfBA_Long.dropna()
# Set the search term as index and sort by it
dfBA_Long_WithIndex = dfBA_Long.set_index('searchterm')
dfBA_Long_WithIndex = dfBA_Long_WithIndex.sort_index()
We have unpivoted the data so that it is now in the following long format (excerpt):
searchterm | rank | position | ASIN |
---|---|---|---|
shaw hart | 237,620 | 3 | B08T9TTC67 |
nautical rug | 344,663 | 1 | B01DVDBSWG |
belts for boys | 167,907 | 1 | B08P8K1HRF |
hori switch controller | 368,320 | 1 | B08KT7ML1R |
rear window prime video | 352,193 | 2 | B00D5UK8DQ |
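If the wide-to-long step is unfamiliar, the following small sketch shows what `melt` does on a made-up two-row frame. The search terms and ASIN values are placeholders, not data from the report.

```python
import pandas as pd

# Wide format: one row per search term, one column per click position (made-up values).
wide = pd.DataFrame({
    "searchterm": ["glass bowl", "mixing bowl"],
    "rank": [10, 20],
    "1": ["A1", "B1"],
    "2": ["A2", None],  # this search term has no second clicked ASIN
    "3": ["A3", "B3"],
})

# Long format: one row per (search term, position) pair.
long_df = wide.melt(id_vars=["searchterm", "rank"], var_name="position", value_name="ASIN")
# The former column names "1", "2", "3" become integer positions; empty slots are dropped.
long_df = long_df.astype({"position": int}).dropna()

print(long_df.sort_values(["searchterm", "position"]))
```

Two search terms with three positions each give six rows, and dropping the empty slot leaves five. In the long format, a single filter on the ASIN column finds a product regardless of its click position.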
Load listing data
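Next, we need the listing information. We assume it has already been loaded into a dataframe called df_Products, with the columns 'ASIN (child)', 'Product Title' and 'Bullet Point 1' to 'Bullet Point 5' that the code in step 5 expects.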
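One possible way to load the listing file is sketched below. It assumes the CSV uses the column names listed earlier (ASIN, Title, Bullet 1 to Bullet 5) and renames them to the names used in step 5. The inline CSV text and the ASIN `B000000000` are placeholders; in practice you would pass the path to your own file.

```python
import io
import pandas as pd

# Stand-in for a real file; replace with the path to your own listing CSV.
listing_csv = io.StringIO(
    "Marketplace,ASIN,Title,Bullet 1,Bullet 2,Bullet 3,Bullet 4,Bullet 5\n"
    'US,B000000000,"Example glass bowl, 2 quart",Sturdy,Clear,With lid,Stackable,\n'
)

# Read everything as text; empty cells become "" so the bullets can be joined later.
df_Products = pd.read_csv(listing_csv, dtype=str).fillna("")

# Map the CSV column names to the names used in step 5 (assumed mapping).
column_map = {"ASIN": "ASIN (child)", "Title": "Product Title"}
column_map.update({f"Bullet {i}": f"Bullet Point {i}" for i in range(1, 6)})
df_Products = df_Products.rename(columns=column_map)

print(list(df_Products.columns))
```

Filling empty cells with an empty string matters because a missing bullet point would otherwise be read as NaN, and joining NaN with strings raises a TypeError.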
Now we have everything we need. Time for our five-step approach!
For which keywords does our product rank? (Step 1)
First, we determine in the Brand Analytics search term report for which keywords our product is found:
# Get "owned keywords", i.e. keywords the ASIN in question is clicked on already
ASIN = 'B0000BYCGF'
ownedKeywords = []
foundKeywords = dfBA_Long[dfBA_Long['ASIN'] == ASIN].searchterm.unique()
ownedKeywords.append(foundKeywords)
The result:
searchterm | rank | position | ASIN |
---|---|---|---|
glass bowls with lids | 48,139 | 1 | B0000BYCGF |
pyrex bowls with lids | 150,202 | 1 | B0000BYCGF |
glass bowl with lid | 394,231 | 1 | B0000BYCGF |
glass bowls with lids food storage | 480,649 | 1 | B0000BYCGF |
pyrex bowls | 53,075 | 2 | B0000BYCGF |
pyrex glass bowls with lids | 406,410 | 2 | B0000BYCGF |
pyrex containers | 423,993 | 2 | B0000BYCGF |
pyrex storage | 433,550 | 2 | B0000BYCGF |
pyrex storage containers with lids | 7,893 | 3 | B0000BYCGF |
pyrex bowl | 330,919 | 3 | B0000BYCGF |
glass pyrex containers with lids | 339,344 | 3 | B0000BYCGF |
pyrex glass | 456,289 | 3 | B0000BYCGF |
Or shown in an array format:
['glass bowls with lids' 'pyrex bowls with lids' 'glass bowl with lid'
'glass bowls with lids food storage' 'pyrex bowls'
'pyrex glass bowls with lids' 'pyrex containers' 'pyrex storage'
'pyrex storage containers with lids' 'pyrex bowl'
'glass pyrex containers with lids' 'pyrex glass']
Which ASINs rank for the keywords from step 1? (Step 2)
Let's see the other ASINs which rank for these keywords:
# Get other ASINs from competitors for ownedKeywords
otherASINs = []
for searchterm in ownedKeywords[0]:
    # print(searchterm)
    foundASINs = dfBA_Long[dfBA_Long['searchterm'] == searchterm].ASIN.unique().flatten()
    otherASINs.append(foundASINs)
# Flatten array of arrays
flat_list_ASINs = [item for sublist in otherASINs for item in sublist]
# Make array unique
flat_list_ASINs = set(flat_list_ASINs)
# Remove own ASIN
flat_list_ASINs.remove(ASIN)
We receive these ASINs:
{'B00LGLHUA0',
'B00M2J7PCI',
'B0157G34AY',
'B0161EG5IE',
'B07L51SFVS',
'B07VKSNSTB',
'B07WT6K984',
'B082SN4QH6',
'B08FCBVY8G',
'B08HR5815V',
'B08VD783DS'}
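As an alternative to the loop, steps 1 and 2 can also be written with `isin`, which filters all owned search terms in one pass. This is a sketch on made-up data, not the notebook's code; the frame `df` and the ASINs are placeholders.

```python
import pandas as pd

# Long-format click data (made-up values).
df = pd.DataFrame({
    "searchterm": ["glass bowl", "glass bowl", "glass bowl", "mixing bowl", "mixing bowl"],
    "ASIN": ["A1", "B1", "B2", "B2", "C1"],
})
own_asin = "A1"

# Step 1: search terms for which the own ASIN is clicked
own_terms = df.loc[df["ASIN"] == own_asin, "searchterm"].unique()
# Step 2: all ASINs clicked for those search terms, without the own ASIN
other_asins = set(df.loc[df["searchterm"].isin(own_terms), "ASIN"])
other_asins.discard(own_asin)

print(sorted(other_asins))  # ['B1', 'B2']
```

`discard` is used instead of `remove` so that the code does not raise an error if the own ASIN is not in the set.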
For which keywords do these ASINs rank? (Step 3)
Now we check for which keywords the ASINs above from step 2 are clicked:
# Get keywords the other ASINs are clicked on
keywordsFromOtherASINs = dfBA_Long[dfBA_Long['ASIN'].isin(flat_list_ASINs)].searchterm.unique()
We receive the result (excerpt for readability):
['pyrex' 'pyrex storage containers with lids'
'glass storage containers with lids' 'mixing bowl'
'pyrex glass storage containers with lids' 'glass measuring cup'
'pyrex measuring cup' 'glass bowls' 'glass mixing bowls' 'glass bowl'
'pyrex bowls' 'glass tupperware sets with lids' 'measuring cup set'
'liquid measuring cups' 'food storage containers glass'
'glass containers with lids' 'pyrex mixing bowls' 'measuring cups glass'
...
'glass food containers with lids' 'kitchen necessities' 'cooking bowls'
'kitchen essentials for new home' 'storage containers for food'
'pyrex 2 cup' 'measuring bowls' 'glass food container'
'glass storage container' 'clear bowl']
Which keywords are new? (Step 4)
We now need to determine which keywords are included in step three but are not found in step one.
import numpy as np

# Get keywords which other ASINs are clicked on but not the own ASIN yet
A = np.array(ownedKeywords)
B = np.array(keywordsFromOtherASINs)
# Values that are in B but not in A
missingKeywords = np.setdiff1d(B, A)
We receive the result:
['baking bowls' 'big bowl' 'clear bowl' 'cooking bowls'
'food containers glass' 'food storage containers glass'
'food storage glass' 'glass airtight food storage containers'
'glass bowl' 'glass bowl set' 'glass bowls' 'glass containers'
'glass containers for food storage with lids'
'glass containers with lids' 'glass food container'
'glass food containers' 'glass food containers with lids'
'glass food storage' 'glass food storage containers'
'glass food storage containers with lids'
'glass food storage containers with lids airtight'
'glass kitchen storage containers' 'glass measuring cup'
'glass measuring cups' 'glass measuring cups pyrex'
'glass measuring cups set' 'glass mixing bowl' 'glass mixing bowl set'
'glass mixing bowls' 'glass mixing bowls with lids'
'glass mixing bowls with lids set' 'glass serving bowl' 'glass storage'
'glass storage container' 'glass storage containers'
'glass storage containers with lids' 'glass tupperware set'
'glass tupperware sets with lids' 'kitchen essentials for new home'
'kitchen necessities' 'large glass bowl' 'large measuring cup'
'liquid measuring cup' 'liquid measuring cup glass'
'liquid measuring cups' 'measure cup' 'measure cups' 'measurement cup'
'measuring' 'measuring bowls' 'measuring cup' 'measuring cup glass'
'measuring cup set' 'measuring cups' 'measuring cups glass'
'measuring glass' 'measuring tools & scales' 'mesururing cup'
'mixing bowl' 'mixing bowls glass' 'pyrex' 'pyrex 2 cup'
'pyrex 2 cup measuring cup glass' 'pyrex glass bowls'
'pyrex glass measuring cup' 'pyrex glass storage containers'
'pyrex glass storage containers with lids' 'pyrex measuring cup'
'pyrex measuring cup set' 'pyrex measuring cups' 'pyrex mixing bowls'
'pyrex mixing bowls with lids' 'pyrex set' 'storage containers for food'
'tupperware glass' 'tupperware sets glass']
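The list above is in alphabetical order because `np.setdiff1d` returns the sorted, unique values of its first argument that do not occur in the second. A small example with made-up search terms:

```python
import numpy as np

owned = np.array(["pyrex bowls", "glass bowl with lid"])
from_others = np.array(["pyrex", "pyrex bowls", "glass bowl", "pyrex"])

# Sorted, de-duplicated values of from_others that are not in owned
missing = np.setdiff1d(from_others, owned)

print(missing)  # ['glass bowl' 'pyrex']
```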
Afterwards we split the search terms into single words and keep all words with at least four characters:
# Split all search terms into single words and remove duplicates using a set
missingKeywords_flattened = set(' '.join(missingKeywords).split(' '))
# Only keep words which have a minimum length of 4
missingKeywords_flattened_reduced = [word for word in missingKeywords_flattened if len(word) >= 4]
missingKeywords_flattened_reduced.sort()
We receive the following result:
['airtight', 'baking', 'bowl', 'bowls', 'clear', 'container', 'containers', 'cooking', 'cups', 'essentials', 'food', 'glass', 'home', 'kitchen', 'large', 'lids', 'liquid', 'measure', 'measurement', 'measuring', 'mesururing', 'mixing', 'necessities', 'pyrex', 'scales', 'serving', 'sets', 'storage', 'tools', 'tupperware', 'with']
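The length filter still lets filler words such as "with" through. The sketch below wraps the splitting in a small function and adds an optional stop-word set; both the function name `split_into_words` and the stop-word parameter are assumptions for illustration and not part of the original script.

```python
def split_into_words(search_terms, min_length=4, stop_words=()):
    """Split search terms into unique lowercase words of at least min_length characters."""
    words = {word for term in search_terms for word in term.lower().split()}
    return sorted(w for w in words if len(w) >= min_length and w not in stop_words)

terms = ["glass bowl set", "big bowl", "bowls with lids", "pyrex 2 cup"]

print(split_into_words(terms))
# ['bowl', 'bowls', 'glass', 'lids', 'pyrex', 'with']
print(split_into_words(terms, stop_words={"with"}))
# ['bowl', 'bowls', 'glass', 'lids', 'pyrex']
```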
Which keywords are missing in the listing? (Step 5)
Now we check which keywords from step 4 are missing in our listing.
To do so, we convert everything to lower case and combine the bullet points into one long string.
# Get the product title for the ASIN in question
productTitle = df_Products[df_Products['ASIN (child)'] == ASIN]['Product Title'].values[0].lower()
# Get a string of all 5 bullet points
allBullets = []
currentProduct = df_Products[df_Products["ASIN (child)"] == ASIN]
for i in range(1, 6):
    allBullets.append(currentProduct['Bullet Point ' + str(i)].values[0])
allBulletsCombined = ' '.join(allBullets)
Afterwards we can check whether a keyword is included:
# Check if a term from missingKeywords_flattened_reduced is not in product title or bullet
termsNotFoundInListing = []
for term in missingKeywords_flattened_reduced:
    if (term.lower() not in productTitle) and (term.lower() not in allBulletsCombined.lower()):
        termsNotFoundInListing.append(term)
print("Missing keywords: " + str(termsNotFoundInListing))
As a result, we get the following:
Missing keywords: ['airtight', 'baking', 'bowl', 'bowls', 'clear', 'cooking', 'cups', 'essentials', 'home', 'kitchen', 'large', 'liquid', 'measure', 'measurement', 'measuring', 'mesururing', 'mixing', 'necessities', 'scales', 'serving', 'sets', 'tools', 'tupperware']
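Note that the check above is a substring test: a term counts as present if it occurs anywhere in the text, so "bowl" is found in a listing that only contains "bowls". The sketch below contrasts this with a whole-word comparison. The function `missing_terms` and its `whole_words` flag are hypothetical and only meant to illustrate the difference.

```python
import re

def missing_terms(terms, listing_text, whole_words=False):
    """Return the terms that do not occur in listing_text (case-insensitive)."""
    text = listing_text.lower()
    if whole_words:
        # Compare against the set of complete words in the listing
        words = set(re.findall(r"[a-z0-9]+", text))
        return [t for t in terms if t.lower() not in words]
    # Substring test, as in the check above
    return [t for t in terms if t.lower() not in text]

listing_text = "Glass bowls with lids for storage"
terms = ["bowl", "bowls", "storage", "mixing"]

print(missing_terms(terms, listing_text))                    # ['mixing']
print(missing_terms(terms, listing_text, whole_words=True))  # ['bowl', 'mixing']
```

Which variant is more useful depends on whether you want to treat singular and plural forms as separate keywords.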
These keywords should be added to the listing where appropriate. With a bit of luck, the product will soon be clicked for these keywords as well. Of course, the keywords should not be used blindly: check that each one actually fits the product, and ignore misspellings such as "mesururing".
Conclusion
If you have the data, even a large product catalog can be checked for missing keywords within seconds. The script was developed for this purpose: it runs the five steps for every product, determines which keywords still need to be added, and saves the results in an Excel file.
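How such a catalog run could look is sketched below. This is not the notebook's code: the function `missing_keywords_for`, the toy click data and the catalog frame with a combined `text` column are assumptions for illustration.

```python
import pandas as pd

def missing_keywords_for(asin, df_long, listing_text, min_length=4):
    """Run the five steps for one ASIN and return the missing words, sorted."""
    own_terms = set(df_long.loc[df_long["ASIN"] == asin, "searchterm"])
    other_asins = set(df_long.loc[df_long["searchterm"].isin(own_terms), "ASIN"]) - {asin}
    other_terms = set(df_long.loc[df_long["ASIN"].isin(other_asins), "searchterm"])
    words = {w for term in other_terms - own_terms for w in term.split() if len(w) >= min_length}
    text = listing_text.lower()
    return sorted(w for w in words if w.lower() not in text)

# Long-format click data and a two-product catalog (made-up values).
df_long = pd.DataFrame({
    "searchterm": ["glass bowl", "glass bowl", "mixing bowl", "garden hose", "hose reel"],
    "ASIN": ["A1", "B1", "B1", "C1", "C1"],
})
catalog = pd.DataFrame({
    "ASIN": ["A1", "C1"],
    "text": ["Glass bowl with lid", "Garden hose, 50 ft, with hose reel"],
})

results = pd.DataFrame({
    "ASIN": catalog["ASIN"],
    "missing": [", ".join(missing_keywords_for(a, df_long, t))
                for a, t in zip(catalog["ASIN"], catalog["text"])],
})
print(results)

# Saving to Excel requires an Excel writer such as openpyxl to be installed:
# results.to_excel("missing_keywords.xlsx", index=False)
```

Here the first product is missing "mixing", while the second shares no search terms with other ASINs and therefore gets an empty result.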
Here → you can download the entire notebook.