Pandas error in Python: columns must be same length as key

You need to modify the solution a bit, because the split sometimes returns two columns and sometimes only one:

df2 = pd.DataFrame({'STATUS': ['Estimated 3:17 PM', 'Delayed 3:00 PM']})
df3 = df2['STATUS'].str.split(n=1, expand=True)
df3.columns = ['STATUS_ID{}'.format(x+1) for x in df3.columns]
print (df3)
  STATUS_ID1 STATUS_ID2
0  Estimated    3:17 PM
1    Delayed    3:00 PM

df2 = df2.join(df3)
print (df2)
              STATUS STATUS_ID1 STATUS_ID2
0 … Read more
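The point of renaming from the split result's own columns is that the code keeps working however many pieces the split produces. A runnable sketch of the approach, using the sample data from the excerpt:

```python
import pandas as pd

# Sample data from the answer above.
df2 = pd.DataFrame({'STATUS': ['Estimated 3:17 PM', 'Delayed 3:00 PM']})

# Split on the first whitespace only; expand=True returns a DataFrame
# whose column count depends on how many pieces the split produced.
df3 = df2['STATUS'].str.split(n=1, expand=True)

# Rename dynamically from df3's actual columns, so this works whether
# the split produced one column or two.
df3.columns = ['STATUS_ID{}'.format(x + 1) for x in df3.columns]

df2 = df2.join(df3)
print(df2)
```

Assigning a hard-coded two-element list to `df3.columns` is what raises "Columns must be same length as key" when a row has no whitespace and the split yields only one column.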

BeautifulSoup: what's the difference between 'lxml', 'html.parser' and 'html5lib' parsers?

From the docs' summarized table of advantages and disadvantages:

html.parser – BeautifulSoup(markup, "html.parser")
Advantages: Batteries included, Decent speed, Lenient (as of Python 2.7.3 and 3.2)
Disadvantages: Not very lenient (before Python 2.7.3 or 3.2.2)

lxml – BeautifulSoup(markup, "lxml")
Advantages: Very fast, Lenient
Disadvantages: External C dependency

html5lib – BeautifulSoup(markup, "html5lib")
Advantages: Extremely lenient, Parses pages … Read more
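The leniency differences only show up on broken markup. A minimal sketch using the built-in parser (lxml and html5lib would each need a separate pip install, and each repairs the same broken input differently):

```python
from bs4 import BeautifulSoup

# Deliberately broken markup: an unclosed <a> followed by a stray </p>.
broken = "<a></p>"

# The built-in html.parser drops the stray </p> and closes the <a>;
# html5lib would instead insert an empty <p></p> inside the link, and
# lxml would wrap the result in full <html><body> scaffolding.
soup = BeautifulSoup(broken, "html.parser")
print(soup)
```

Because the parsers disagree on invalid HTML, scraping code that relies on the repaired tree should pin one parser explicitly rather than letting BeautifulSoup pick whichever is installed.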

How do I avoid HTTP error 403 when web scraping with Python?

This is probably because of mod_security or some similar server security feature that blocks known spider/bot user agents (urllib sends something like Python-urllib/3.3.0, which is easily detected). Try setting a known browser user agent:

from urllib.request import Request, urlopen

req = Request(
    url='http://www.cmegroup.com/trading/products/#sortField=oi&sortAsc=false&venues=3&page=1&cleared=1&group=1',
    headers={'User-Agent': 'Mozilla/5.0'}
)
webpage = urlopen(req).read()

This works for me. By … Read more
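A runnable sketch of the header trick (the fetch itself is commented out so no network call is made; the URL is shortened from the answer):

```python
from urllib.request import Request, urlopen

# Supply a browser-like User-Agent instead of the default
# Python-urllib/x.y, which many servers reject with 403.
req = Request(
    url='http://www.cmegroup.com/trading/products/',
    headers={'User-Agent': 'Mozilla/5.0'},
)

# urllib normalizes header names to capitalized-first-word form,
# so the stored header is retrieved as 'User-agent'.
print(req.get_header('User-agent'))

# To actually fetch the page:
# webpage = urlopen(req).read()
```

If the server still returns 403 with a browser User-Agent, it is usually checking more than that header (cookies, Referer, or JavaScript challenges), and a plain urllib request will not be enough.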

Jsoup Cookies for HTTPS scraping

I know I'm kinda late by 10 months here, but a good option using Jsoup is this easy peasy piece of code:

// This will get you the response.
Response res = Jsoup
    .connect("url")
    .data("loginField", "[email protected]", "passField", "pass1234")
    .method(Method.POST)
    .execute();

// This will get you the cookies.
Map<String, String> cookies = res.cookies();

// And this is the … Read more
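The same flow, POST the credentials, capture the session cookies from the response, then reuse them on later requests, can be sketched in Python with only the standard library. The URL and form field names below are placeholders mirroring the Jsoup snippet, not a real endpoint:

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# A cookie jar shared by every request made through this opener.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# POSTing the login form would store any Set-Cookie headers in `jar`
# automatically (commented out so the sketch makes no network call):
# data = urllib.parse.urlencode(
#     {'loginField': 'user@example.com', 'passField': 'pass1234'}).encode()
# opener.open('https://example.com/login', data)

# Any later opener.open(...) through the same opener sends those
# cookies back, giving an authenticated session, analogous to passing
# res.cookies() into the next Jsoup.connect(...) call.
print(len(jar))
```

This works for sites that use plain cookie-based sessions over HTTPS; sites with CSRF tokens in the login form need the token scraped from the login page first.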

How to get all html data after all scripts and page loading is done? (puppeteer)

If you want the full HTML, the same as you see in the inspector, here it is:

const puppeteer = require('puppeteer');

(async function main() {
  try {
    const browser = await puppeteer.launch();
    const [page] = await browser.pages();
    await page.goto('https://example.org/', { waitUntil: 'networkidle0' });
    const data = await page.evaluate(() => document.querySelector('*').outerHTML);
    console.log(data);
    await browser.close();
  } catch (err) {
    console.error(err);
  }
})();

Can a Telegram bot read messages of channel

The FAQ reads:

All bots, regardless of settings, will receive:
- All service messages.
- All messages from private chats with users.
- All messages from channels where they are a member.

Bot admins and bots with privacy mode disabled will receive all messages except messages sent by other bots.

Bots with privacy mode enabled will receive:
- Commands … Read more
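In the Bot API, channel messages arrive on each update under the separate channel_post field rather than message, so handlers must check both. A minimal sketch (the update dicts below are hand-made examples, not real API output):

```python
# Channel messages arrive under "channel_post"; ordinary chat
# messages under "message". A handler that reads both:
def extract_text(update):
    msg = update.get("message") or update.get("channel_post")
    return msg.get("text") if msg else None

channel_update = {
    "update_id": 1,
    "channel_post": {"chat": {"type": "channel", "title": "news"}, "text": "hello"},
}
private_update = {
    "update_id": 2,
    "message": {"chat": {"type": "private"}, "text": "hi bot"},
}

print(extract_text(channel_update))  # hello
print(extract_text(private_update))  # hi bot
```

The bot still has to be added to the channel as an administrator to receive these updates at all; privacy mode only affects group chats, not channels.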

Error!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)