
I am trying to scrape formularylookup.com, a site with information on the market for pharmaceuticals.

It requires a login: username: - password: -

I need the information for the medicine called Rybelsus.

Looking at Inspect -> Network -> XHR in the browser, I suspect there could be an easy way to get the required data from this page:

https://formularylookup.com/Formulary/Coverage?ProductId=237171&ProductName=Rybelsus&ChannelId=1&DrugTypeId=3&StateId=all&Options=SummaryCoverages

I identified this site, which might give an idea of how to connect to formularylookup.com, but I am very inexperienced with connecting to APIs.

Here's my code:

import requests
from bs4 import BeautifulSoup


url = "https://api.mmitnetwork.com/Formulary/v1/Products?Name=rybelsus"
params = {
    "ProductId": "237171",
    "productSearch": "Rybelsus",
}

headers = {
    "authorization": "Bearer H-oa4ULGls2Cpu8U6hX4myixRoFIPxfj",
    "Access-Token": "H-oa4ULGls2Cpu8U6hX4myixRoFIPxfj",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
    "Host": "formularylookup.com",
    "X-NewRelic-ID": "XAYCVFZSGwcGU1lXBAI=",
}

res = requests.get(url, params=params, headers=headers)
soup = BeautifulSoup(res.content, "lxml")
print(soup.prettify())

Which gives me the following response:

<!DOCTYPE html>
<html>
 <head>
  <title>
   The resource cannot be found.
  </title>
  <meta content="width=device-width" name="viewport"/>
  <style>
   body {font-family:"Verdana";font-weight:normal;font-size: .7em;color:black;} 
         p {font-family:"Verdana";font-weight:normal;color:black;margin-top: -5px}
         b {font-family:"Verdana";font-weight:bold;color:black;margin-top: -5px}
         H1 { font-family:"Verdana";font-weight:normal;font-size:18pt;color:red }
         H2 { font-family:"Verdana";font-weight:normal;font-size:14pt;color:maroon }
         pre {font-family:"Consolas","Lucida Console",Monospace;font-size:11pt;margin:0;padding:0.5em;line-height:14pt}
         .marker {font-weight: bold; color: black;text-decoration: none;}
         .version {color: gray;}
         .error {margin-bottom: 10px;}
         .expandable { text-decoration:underline; font-weight:bold; color:navy; cursor:hand; }
         @media screen and (max-width: 639px) {
          pre { width: 440px; overflow: auto; white-space: pre-wrap; word-wrap: break-word; }
         }
         @media screen and (max-width: 479px) {
          pre { width: 280px; }
         }
  </style>
 </head>
 <body bgcolor="white">
  <span>
   <h1>
    Server Error in '/' Application.
    <hr color="silver" size="1" width="100%"/>
   </h1>
   <h2>
    <i>
     The resource cannot be found.
    </i>
   </h2>
  </span>
  <font face="Arial, Helvetica, Geneva, SunSans-Regular, sans-serif ">
   <b>
    Description:
   </b>
   HTTP 404. The resource you are looking for (or one of its dependencies) could have been removed, had its name changed, or is temporarily unavailable.  Please review the following URL and make sure that it is spelled correctly.
   <br/>
   <br/>
   <b>
    Requested URL:
   </b>
   /Formulary/v1/Products
   <br/>
   <br/>
   <hr color="silver" size="1" width="100%"/>
   <b>
    Version Information:
   </b>
   Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.6.1590.0
  </font>
 </body>
</html>
<!-- 
[HttpException]: The controller for path &#39;/Formulary/v1/Products&#39; was not found or does not implement IController.
   at System.Web.Mvc.DefaultControllerFactory.GetControllerInstance(RequestContext requestContext, Type controllerType)
   at System.Web.Mvc.DefaultControllerFactory.CreateController(RequestContext requestContext, String controllerName)
   at System.Web.Mvc.MvcHandler.ProcessRequestInit(HttpContextBase httpContext, IController& controller, IControllerFactory& factory)
   at System.Web.Mvc.MvcHandler.BeginProcessRequest(HttpContextBase httpContext, AsyncCallback callback, Object state)
   at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
-->
<!-- 
This error page might contain sensitive information because ASP.NET is configured to show verbose error messages using &lt;customErrors mode="Off"/&gt;. Consider using &lt;customErrors mode="On"/&gt; or &lt;customErrors mode="RemoteOnly"/&gt; in production environments.-->

Update: I get a 404 error. Not sure why.

  • response is 404, not sure why Commented Feb 1, 2020 at 14:44

1 Answer

The code below should help:

import requests

headers = {
    'Accept': '*/*',
    'X-Requested-With': 'XMLHttpRequest',
    'Access-Token': '7Lq-KkDx2fCO_3kG90pLEpBS9Ssh62IQ',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36',
    'Is-Session-Expired': 'false',
    'Referer': 'https://formularylookup.com/',
}

response = requests.get('https://formularylookup.com/Formulary/Coverage?ProductId=237171&ProductName=Rybelsus&ChannelId=1&DrugTypeId=3&StateId=AL&Options=SummaryCoverages', headers=headers)

print(response.json())

Note: the 'Is-Session-Expired': 'false' header is essential; without it you'll get a 404 error.
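
As a rough sketch of the same request in a reusable form, the query string and headers can be built by small helpers, so the product or state is easy to change. The function names (build_coverage_url, build_headers, fetch_coverage) are hypothetical. The ChannelId and DrugTypeId values are simply copied from the URL in the question. The Access-Token is assumed to be session-specific, so a fresh one would have to be taken from the browser's network tab.

```python
from urllib.parse import urlencode

BASE_URL = "https://formularylookup.com/Formulary/Coverage"
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36"
)


def build_coverage_url(product_id, product_name, state_id="AL"):
    """Assemble the Coverage URL; ChannelId and DrugTypeId are copied from the question."""
    query = {
        "ProductId": product_id,
        "ProductName": product_name,
        "ChannelId": 1,
        "DrugTypeId": 3,
        "StateId": state_id,
        "Options": "SummaryCoverages",
    }
    return f"{BASE_URL}?{urlencode(query)}"


def build_headers(access_token):
    """Headers used in the answer; the token is assumed to be tied to a browser session."""
    return {
        "Accept": "*/*",
        "X-Requested-With": "XMLHttpRequest",
        "Access-Token": access_token,
        "User-Agent": USER_AGENT,
        "Is-Session-Expired": "false",  # without this header the site answers 404
        "Referer": "https://formularylookup.com/",
    }


def fetch_coverage(access_token, product_id=237171, product_name="Rybelsus", state_id="AL"):
    """Send the request and return the decoded JSON, raising on HTTP errors."""
    import requests  # imported here so the helpers above work without it installed

    response = requests.get(
        build_coverage_url(product_id, product_name, state_id),
        headers=build_headers(access_token),
        timeout=30,
    )
    response.raise_for_status()  # turns a 404 into an exception instead of an HTML error page
    return response.json()
```

Calling fetch_coverage("<your token>") should then return the same data as the snippet above, provided the token is still valid.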



4 Comments

I think I've been working on this problem for two full working days. Initially I tried the API, but couldn't make it work. Then I tried a huge ass code with selenium and BS4, but eventually got stuck, so I turned to the API again. Now I have a better idea of how it works. Thank you! Now to turn this into a pandas dataframe, should be easy.
Thank you so much! I swear I always underestimate how helpful some people on SO are :'(
@doomdaam Glad to hear!
I posted a new question after playing around if you're interested: stackoverflow.com/questions/60019043/…
