Welcome to our second video in the series! In this video, we'll show you how to spice up your data pipeline by using variables and expressions. By making a few changes to the pipeline we created in the previous video, we can dynamically insert values, loop through lists, and enhance the overall functionality of our pipeline.
Creating a Variable for Google Scholar Author IDs
The first change we need to make is to create a variable to hold the list of Google Scholar author IDs that we want to download. This will allow us to easily modify and update the list without having to manually change the pipeline each time. To do this, follow these steps:
Copy the author ID from the relative URL that was hard-coded into the activity.
Click on the canvas area and go to the variables tab.
Click on "New" and give your variable a name, such as "authorID."
Paste the copied value into the variable.
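Conceptually, a pipeline variable is just a named value with a type and a default that activities can read at run time. The Python sketch below models that idea only; it is not Data Factory code, and the ID shown is a made-up placeholder rather than a real author ID.

```python
# A minimal model of a pipeline variable: a name mapped to a type and a default value.
# "PLACEHOLDER_AUTHOR_ID" is a hypothetical stand-in, not a real Google Scholar ID.
pipeline_variables = {
    "authorID": {"type": "String", "defaultValue": "PLACEHOLDER_AUTHOR_ID"},
}


def get_variable(name: str) -> str:
    """Look up a variable's value by name, as an activity would at run time."""
    return pipeline_variables[name]["defaultValue"]


print(get_variable("authorID"))
```

Changing the author then means editing one value in one place, which is the point of moving the ID out of the activity.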
Dynamically Inserting the Variable into the Relative URL Field
Now that we have our variable, we can dynamically insert its value into the relative URL field. Think of a Microsoft Word mail merge: it works just like that! This will eventually let us merge in the author ID dynamically, which will come in handy when dealing with multiple authors. To accomplish this, follow these steps:
Go back to the copy data activity.
Instead of hard-coding the relative URL, we'll use a formula to dynamically create it.
Use the "concat" function to concatenate the different parts of the URL.
Insert the value of the authorID variable into the URL.
After making these changes, your relative URL field should contain the formula instead of the hard-coded value. This ultimately enables our pipeline to fetch data for a whole list of authors, which we will cover in the next video!
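The concat step can be pictured with a small Python sketch. In the pipeline's expression language, the same idea is typically written along the lines of @concat('<fixed part of the URL>', variables('authorID')). The URL fragment and ID below are hypothetical placeholders, since the actual relative URL depends on the API set up in the previous video.

```python
def concat(*parts: str) -> str:
    """Join string parts in order, mirroring what a concat expression does."""
    return "".join(parts)


# Hypothetical values: substitute the relative URL and ID from your own pipeline.
author_id = "PLACEHOLDER_AUTHOR_ID"
relative_url = concat("/hypothetical/path?author_id=", author_id)

print(relative_url)  # /hypothetical/path?author_id=PLACEHOLDER_AUTHOR_ID
```

Only the variable's value changes between runs; the fixed part of the URL stays in the formula.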
Testing and Verification
Before moving on, it's crucial to run the pipeline and verify that everything is functioning as expected. Here's what you need to do:
Create another variable for the API key and pass it in with the author ID using similar steps as above. This sets us up to pass the API key into the pipeline more securely down the road, instead of hard-coding it in the activity.
Test the pipeline by clicking on "Run."
Wait for the pipeline to complete its execution.
Navigate to your Google Scholar lakehouse and verify that the data has been successfully saved.
Click on Google Scholar on the left-hand side and navigate to your bronze destination.
Locate the file and open it to ensure that the data has been retrieved correctly.
If everything looks good, congratulations! Your variables and expressions are working as intended.
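If you would rather check the output programmatically, for example from a notebook, a sketch like the following confirms that a file exists and is not empty. The file name is hypothetical, and the demonstration writes a temporary file so that the snippet runs on its own; in practice you would point it at the file in your own bronze destination.

```python
import os
import tempfile


def file_has_data(path: str) -> bool:
    """Return True if the file exists and contains at least one byte."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


# Demonstration with a temporary file standing in for the downloaded output.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample_author.json")  # hypothetical name
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write('{"placeholder": true}')
    print(file_has_data(sample_path))
```

A non-empty file only shows that something was saved, so it is still worth opening the file and reading the contents as described above.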
In this video, we explored how to spice up your data pipeline by using variables and expressions. By creating a variable for your author ID and dynamically inserting its value into the relative URL field, you created a more flexible pipeline that allows you to download different authors or even loop through a list of author IDs. Using variables and expressions can greatly enhance the functionality and flexibility of your pipeline.
In the next video, we'll dive deeper into creating a list of authors and looping through them. Stay tuned for more exciting tips and tricks to level up your Data Factory skills. Until then, happy data wrangling!