Self Explore App part 2

14 min read · Mar 16, 2021

Possible solution: change the download path in the Google Chrome settings.

Remember to escape the backslashes in your path string and base it on your own drive.

D:\\Lambda\\projects\\puppeteer_test\\data

The script will make the folder if the folder does not exist.
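A minimal sketch of that folder check, assuming Node's built-in fs module. The helper name ensureDownloadDir and the temp-folder demo path are made up for illustration and are not part of the original script.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: create the download folder if it is missing.
function ensureDownloadDir(dir) {
  // recursive: true creates missing parent folders and does not throw if dir already exists
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Demo with a throwaway folder under the OS temp directory (an assumption for this sketch).
const demoDir = ensureDownloadDir(path.join(os.tmpdir(), "puppeteer_test", "data"));
console.log(fs.existsSync(demoDir));
```

In the real script the argument would be the escaped project path shown above.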

The puppeteer way

const downloadBrowser = await puppeteer.launch({env: {PUPPETEER_DOWNLOAD_PATH: "D:\\Lambda\\projects\\puppeteer_test"}});

The way that is bound to change

// set download location to local project path
await page._client.send("Page.setDownloadBehavior", {
  behavior: "allow",
  downloadPath: "D:\\Lambda\\projects\\puppeteer_test",
});

The custom way

chrome://settings/downloads

Change button

document.querySelector("settings-ui").shadowRoot.querySelector("#main").shadowRoot.querySelector("settings-basic-page").shadowRoot.querySelector("[section='downloads']").querySelector("settings-downloads-page").shadowRoot.querySelector("#changeDownloadsPath")

Change location

Select the file path
Paste the desired path

D:\Lambda\projects\puppeteer_test

Click select path

I can do all of this in another tab before the download starts.

Can create a custom wait for file download when working with the chrome download page.

chrome://downloads/

I can check every second until the download progress bar stops.
If the download speed reaches 0 and the downloaded MB is not equal to the total MB (or 0 seconds are left), I download the file again and, 3 seconds later, cancel it.

Progress bar

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#progress")

Not so fast. The div goes invisible.

You can still find the element in the document. I have to find the parent div of #progress and check whether its style is display: none.

This could work

$('li[style*="display: none"]')

Working code:

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details div:nth-child(4)[style*='display: none']").hidden

Note:

When it finishes downloading, the download speed stays at 0 for 2–3 seconds.

The download can stop here and show a prompt asking where to save the file.

Suppose it fails to download because of lack of space. Then it downloads again.

Goal: select an element inside the shadow tree without having to use the shadowRoot property.

What is Shadow DOM?

Using shadow DOM

An important aspect of web components is encapsulation — being able to keep the markup structure, style, and behavior…

developer.mozilla.org.

https://stenciljs.com/docs/end-to-end-testing#find-an-element-in-the-shadow-dom

part: A space-separated list of the part names of the element. Part names allow CSS to select and style specific elements in a shadow tree via the ::part pseudo-element.

I cannot use it because the part needs to be set.

The :host pseudo-class, when evaluated in the context of a shadow tree, matches the shadow tree’s shadow host. In any other context, it matches nothing.

The :host() function pseudo-class has the syntax:

:host( <compound-selector> )

Using :host got me null values.

Element.shadowRoot

The Element.shadowRoot read-only property represents the shadow root hosted by the element. Use Element.attachShadow()…

developer.mozilla.org

A workaround could be to use the page.evaluateHandle to target the DOM element.
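A rough sketch of that workaround. deepQuery and getDeepHandle are hypothetical helper names; the sketch assumes that page.evaluateHandle accepts a string expression, and that each selector is looked up inside the previous element's shadow root when it has one.

```javascript
// Hypothetical helper: walk a chain of selectors, stepping into each shadow root on the way.
function deepQuery(root, selectors) {
  let node = root;
  for (const selector of selectors) {
    if (!node) return null;
    // Search inside the shadow root when the current node hosts one.
    const scope = node.shadowRoot || node;
    node = scope.querySelector(selector);
  }
  return node || null;
}

// Sketch of use from Puppeteer: send the helper to the page as a string expression.
// `page` is assumed to be a tab that already shows chrome://downloads/.
function getDeepHandle(page, selectors) {
  const expression = `(${deepQuery.toString()})(document, ${JSON.stringify(selectors)})`;
  return page.evaluateHandle(expression);
}
```

For the progress bar, the selector list would be ["downloads-manager", "#mainContainer", "#downloadsList", "downloads-item", "#details", "#progress"], which replaces the long chain of .shadowRoot.querySelector calls.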

page.waitForSelector(selector[, options])

There is no selector option. I cannot use this when evaluating.

await settingPage.waitForNavigation({ timeout: 0, waitUntil: "networkidle0"})

Waits indefinitely or until time out.

setTimeout()

It is best to call clearTimeout() when done using the timeout.

Remember to capture the id by assigning the return value of setTimeout() to a variable

setInterval()

You have to call clearInterval() to stop the code from repeating.

Remember to capture the id by assigning the return value of setInterval() to a variable
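The two timers can be combined into a small polling helper. This is only a sketch: waitUntil is a hypothetical name, and the check function stands in for whatever condition the script needs, such as "the progress bar is hidden".

```javascript
// Hypothetical helper: poll `check` every `intervalMs` until it returns true or `timeoutMs` passes.
function waitUntil(check, intervalMs, timeoutMs) {
  return new Promise((resolve) => {
    // Keep both ids so each timer can be cleared once the wait is over.
    const intervalId = setInterval(() => {
      if (check()) {
        clearInterval(intervalId);
        clearTimeout(timeoutId);
        resolve(true);
      }
    }, intervalMs);
    const timeoutId = setTimeout(() => {
      clearInterval(intervalId);
      resolve(false);
    }, timeoutMs);
  });
}

// Demo: the condition becomes true on the third check.
let checks = 0;
waitUntil(() => ++checks >= 3, 10, 1000).then((done) => console.log("finished:", done));
```

Because both timers are cleared, nothing keeps running after the promise resolves.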

Optimization hints

jQuery - Trigger event when an element is removed from the DOM

stackoverflow.com

GlobalEventHandlers.onchange

The onchange property of the GlobalEventHandlers mixin is an EventHandler for processing change events. change events fire when the…

developer.mozilla.org

It takes 52 seconds to 1 minute for the download to time out.
Is there a function like setInterval that can wait for an event to fire? Maybe there is a Chrome DevTools event listener breakpoint that I can use with Puppeteer that lets me know when a file is downloading, stops downloading, and finishes downloading.

Progress so far

Custom file waiter is working.

It needs a way to deal with a Facebook network error.
It can be optimized by making the code wait for downloading of the file to be finished.
It needs to be tested with the other scripts.

Success

Need to do before moving to electron integration

Does it work headless?
Network issues need to be solved automatically.

Network issues need to be solved automatically.

Cancel button

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#safe").querySelectorAll("cr-button")[1]

Picking a download folder

Downloads are ordered by row index. The newest is always number 1, and the oldest has an index equal to the total number of downloaded files (3 in this example). The example below shows how to select the 2nd download.

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item[aria-rowindex='2']")

Remove option

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#remove")

description

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#description").innerText
"0 B/s - 248 MB of 694 MB, Paused"

Download speed

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#description").innerText.split(" ")[0]
"0"

Current Data

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#description").innerText.split(" ")[3]
"112"

Total data

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#description").innerText.split(" ")[6]
"694"

Convert the string to a number

Number( )

Number(document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#description").innerText.split(" ")[3])

Number

The Number constructor contains constants and methods for working with numbers. Values of other types can be converted to…

developer.mozilla.org.
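The three split() lookups and the Number() conversion can be folded into one function. A small sketch, where parseDescription is a hypothetical name and the text is assumed to always follow the "0 B/s - 248 MB of 694 MB, Paused" layout shown above (other units or a finished download would need extra handling):

```javascript
// Hypothetical helper: turn the #description text into numbers.
// Assumes the "<speed> B/s - <current> MB of <total> MB, ..." layout.
function parseDescription(text) {
  const parts = text.split(" ");
  return {
    speed: Number(parts[0]),   // download speed
    current: Number(parts[3]), // data downloaded so far
    total: Number(parts[6]),   // total size of the file
  };
}

const status = parseDescription("0 B/s - 248 MB of 694 MB, Paused");
console.log(status);
// A stalled download has speed 0 while current is still below total.
console.log(status.speed === 0 && status.current < status.total);
```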

Contrast between the description of a finished download and one that is still in progress

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#description").hidden

Interesting

The Location.reload() method reloads the current URL, like the Refresh button.

JavaScript Date objects represent a single moment in time in a platform-independent format.

Date().split(" ")[4]

Can you overwrite same name files in chromium?

Node Debugger does not work with Puppeteer evaluator

Solutions:

Wrap the evaluator DOM element in a variable and use Puppeteer instead of the evaluator
Create a callback that loops until the file is downloaded successfully
Redownload the document from the DevTools console

The document could be redownloaded by clicking the downloading URL

document.querySelector("downloads-manager").shadowRoot.querySelector("#mainContainer").querySelector("#downloadsList").querySelector("downloads-item").shadowRoot.querySelector("#details").querySelector("#url")

Auto network fix is a success.

Start DevTools on the Console tab.

No easy way to do so

Refactor the script

Get all functions into their own file

loginToFacebook
createData
waitForData
downloadData

New function names

loginToFacebook
createFile
waitForFile
downloadFile

Understanding module.exports and exports in Node.js

James shows how to work with modules in Node.js, focusing on how to export and consume them, and explaining the…

www.sitepoint.com

Others

research

const fs = require("fs");

Twitter Data Archive Automation

Current Script Hierarchy

Testing for a breakthrough:

Fix frame detach

The frame detachment could have happened because, when I reloaded the page, I didn’t wait for the frame to load its elements again.
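One way to avoid the stale frame, sketched under the assumption that the page has a single iframe: reload, wait for the iframe element again, and only then ask for its frame. reloadAndGetFrame is a hypothetical helper; page.reload, page.waitForSelector, and elementHandle.contentFrame are the Puppeteer calls it relies on.

```javascript
// Hypothetical helper: reload the page, then wait for the iframe before touching it.
async function reloadAndGetFrame(page, iframeSelector = "iframe") {
  await page.reload({ waitUntil: "load" });
  // Handles to the old frame are detached after a reload, so look the iframe up again.
  const iframeHandle = await page.waitForSelector(iframeSelector);
  return iframeHandle.contentFrame();
}
```

Every element lookup after a reload should go through the frame returned here, not through a frame variable saved before the reload.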

Simplify waitForFileDownload

The evaluation can be simplified by using it to create a JSHandle of the variables and extract a boolean value.

This opens up reusing the start of the file download if the site asks the user to re-enter their credentials, and it removes the custom waiting functions.
It also allows console logs to appear in the script instead of in the headless Chromium browser.

Can we wait for the navigation load event instead of waiting for all network activity to die down?

Puppeteer works fine in dev mode but breaks in headless mode.

Chrome download is not accessible in headless mode.

chrome://downloads/

chrome://downloads/ may be part of the GUI that headless (command-line) mode does not load.

Solution: I need a command that updates me on the current download status

class: HTTPRequest

The Puppeteer GitHub docs feel more complete than the website docs.

httpRequest.url()

returns: <string> URL of the request.

httpRequest.headers()

returns: <Object> An object with HTTP headers associated with the request. All header names are lower-case.
MDN

Useful headers

request URL
Status Code

Where can I find the download speed?

page.waitForRequest(urlOrPredicate[, options])

capture the request URL of the get 200 status
wait for the request URL with a timeout of 0
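A sketch of those two steps with page.waitForRequest and a predicate. waitForFileRequest is a hypothetical helper, and urlPart is an assumed fragment of the download URL. Note that the status code belongs to the response, so this only matches on the request method and URL.

```javascript
// Hypothetical helper: resolve with the first GET request whose URL contains `urlPart`.
function waitForFileRequest(page, urlPart) {
  return page.waitForRequest(
    (request) => request.method() === "GET" && request.url().includes(urlPart),
    { timeout: 0 } // 0 disables the timeout, so this waits until a match arrives
  );
}
```

The resolved request exposes url() and headers(), which are the two methods listed above.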

Research and development

Playwright makes everything I want to do with Puppeteer easy, period.

Wait for the file solution.

Downloads | Playwright

For every attachment downloaded by the page, page.on('download') event is emitted. If you create a browser context with…

playwright.dev
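A sketch of waiting for a file with the download event. It assumes the page belongs to a context created with acceptDownloads: true; downloadByClick, the selector, and the target folder are hypothetical.

```javascript
// Hypothetical helper: click `selector` and save the download it triggers into `dir`.
async function downloadByClick(page, selector, dir) {
  // Start waiting before clicking so the event cannot be missed.
  const [download] = await Promise.all([
    page.waitForEvent("download"),
    page.click(selector),
  ]);
  const target = `${dir}/${download.suggestedFilename()}`;
  // saveAs resolves once the download has finished and the file has been copied.
  await download.saveAs(target);
  return target;
}
```

This replaces the custom file waiter built on chrome://downloads/, since saveAs itself waits for the download to complete.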

Handle authentication gracefully

Authentication | Playwright

Playwright can be used to automate scenarios that require authentication. Tests written with Playwright execute in…

playwright.dev
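A sketch of reusing the authentication state, based on context.storageState() and the storageState option of browser.newContext(). The helper names and the state file path are assumptions.

```javascript
// Hypothetical helpers; statePath is an assumed file location such as "./auth.json".
async function saveLogin(context, statePath) {
  // Writes the cookies and localStorage of the logged-in context to disk.
  await context.storageState({ path: statePath });
}

async function newLoggedInContext(browser, statePath) {
  // Starts a fresh context that already carries the saved session.
  return browser.newContext({ storageState: statePath, acceptDownloads: true });
}
```

saveLogin would run once after a successful login, and later runs would start from newLoggedInContext instead of filling in the credentials again.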

The major component for fixing network error.

Download | Playwright

Browser context must be created with the acceptDownloads set to true when the user needs access to the downloaded content…

playwright.dev

Node.js v15.9.0 Documentation

Almost all Node.js applications, no matter how simple, use streams in some manner. The following is an example of using…

nodejs.org

Useful yarn commands

yarn add <package...> [--dev/-D]

Using --dev or -D will install one or more packages in your devDependencies.

yarn run [script] [<args>]

If you have defined a scripts object in your package, this command will run the specified [script].

Create React app on Yarn

yarn create <starter-kit-package> [<args>]

This command is a shorthand that helps you do two things at once:

Install create-<starter-kit-package> globally, or update the package to the latest version if it already exists
Run the executable located in the bin field of the starter kit’s package.json, forwarding any <args> to it

For example, yarn create react-app my-app is equivalent to:

$ yarn global add create-react-app
$ create-react-app my-app

Start package.json

yarn init

This command walks you through an interactive session to create a package.json file. Some defaults such as the license and initial version are found in yarn’s init-* config settings.

Rebuilding the download file script on Playwright

document.querySelector("iframe")

Waiting for the iframe element

Sometimes Facebook adds an iframe on the login page

waiting for navigation load

waiting for navigation `domcontentloaded`

wait for navigation network idle

console.log(await download.createReadStream())

Error

[Error: EPERM: operation not permitted, unlink 'C:\Users\Owner\AppData\Local\Temp\playwright_chromiumdev_profile-3XZLP0\CrashpadMetrics-active.pma'] {
  errno: -4048,
  code: 'EPERM',
  syscall: 'unlink',
  path: 'C:\\Users\\Owner\\AppData\\Local\\Temp\\playwright_chromiumdev_profile-3XZLP0\\CrashpadMetrics-active.pma'
}

After some time, the network idle became like this.

Page | Playwright

The page provides methods to interact with a single tab in a Browser or an extension background page in Chromium. One…

playwright.dev

How to change the name of a download file

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({
    args: ["--start-maximized", "--disable-notifications", "--disable-extensions", "--mute-audio"],
    devtools: true,
    slowMo: 50,
    defaultViewport: null,
    downloadsPath: "D:\\Lambda\\projects\\puppeteer_test\\data",
  });
  const page = await browser.newPage({
    acceptDownloads: true,
    viewport: null,
  });
  // register the download listener before navigating so the event is not missed
  page.on('download', async download => {
    await download.saveAs("./data/img.png")
  });
  await page.goto("https://www.writeaplay.co.uk/cms/wp-content/uploads/2015/04/The-Progressive-Playwright-1080x380.jpg");
  // await browser.close();
})();

The problem is that it downloads the file twice.

Solution

page.on('download', async download => {
  // save the downloaded file under its suggested file name
  await download.saveAs(`./data/${download.suggestedFilename()}`)
  // delete the copy with the cryptic file name
  await download.delete()
});

If the same file is redownloaded it will be overwritten.

Weird behavior recorded

I noticed that it didn’t work for the first 3 downloads of a large image, but after that, it did. It also works fine for videos.

Error: EPERM: operation not permitted, unlink

[Error: EPERM: operation not permitted, unlink 'C:\Users\Owner\AppData\Local\Temp\playwright_chromiumdev_profile-lE48xW\CrashpadMetrics-active.pma'] {
  errno: -4048,
  code: 'EPERM',
  syscall: 'unlink',
  path: 'C:\\Users\\Owner\\AppData\\Local\\Temp\\playwright_chromiumdev_profile-lE48xW\\CrashpadMetrics-active.pma'
}

To solve it, you have to:

1) add your project folder to the Windows Defender exclusions list.

2) remove your node_modules folder

3) run:

npm install

npm ERR! Error: EPERM: operation not permitted, unlink

I'm working on an expo application. I want to install all the packages on my expo application (npm install). But…

stackoverflow.com

How can I tell the user how the progress is going?

What time it started downloading.
Current download speed
Current data — Max data — current time

I don’t know how to get the above live data, but I can run an interval console.log every half a minute to say that the file is still being downloaded.

1/2 a minute download reports

Downloading more than 1 file will keep the reports coming until the program is stopped

setInterval()
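A sketch of such a report that can also be stopped per file, which is one possible way around the "reports keep coming" problem. startDownloadReport is a hypothetical helper; the 30-second default matches the half-minute report, and the time stamp uses the Date().split(" ")[4] trick from earlier.

```javascript
// Hypothetical helper: log a progress line every `intervalMs` until stop() is called.
function startDownloadReport(fileName, intervalMs = 30000, log = console.log) {
  const now = () => Date().split(" ")[4]; // the "HH:MM:SS" part of the date string
  log(`${now()} started downloading ${fileName}`);
  const intervalId = setInterval(() => {
    log(`${now()} still downloading ${fileName}`);
  }, intervalMs);
  // Return a stop function so each download can end its own report.
  return function stop() {
    clearInterval(intervalId);
    log(`${now()} finished ${fileName}`);
  };
}

// Demo: start a report and stop it right away, as the script would when the download completes.
const stopReport = startDownloadReport("facebook-data.zip");
stopReport();
```

Each download keeps its own interval id inside the closure, so finishing one file does not depend on the program exiting.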

Things to do

- Network fix needs to be implemented

- File report bug needs fixing

Downloading more than 1 file will keep the reports coming until the program is stopped

- Move the whole script to Playwright

- Add Reuse authentication state solution

Convert Puppeteer scripts to Playwright

Issue

That said, it still works on dev tools.

Solution

The error was due to all files expiring. The solution is to get a new file.

Polishing and debugging script for v1 integration

handle the case when the get file button is disabled

document.querySelector("iframe").contentDocument.querySelector("button[aria-disabled='false']")

The cancel button also appears, so I need a more specific identifier path.

$x("//button//div[text()='Create File']/../..")[0]

When the button state should have been true, it was false. Does this mean that the doc variable holds a snapshot of a previous state?

Frame detaching after reloading the page

Why does frame selection breaks after page reloads?

I am trying to inspect a page with playwright that holds a frame document that when I click a button, a banner will…

stackoverflow.com

Solution for download issues

Request | Playwright

Whenever the page sends a request for a network resource, the following sequence of events are emitted by Page: If…

playwright.dev

Download | Playwright

Browser context must be created with the acceptDownloads set to true when the user needs access to the downloaded content…

playwright.dev

Error

Solution

Speed up wait for download event to fail

I will need to wait for the input selector to fail and then wait until the event times out.

Even with the error, the script continues and stops after the file has been downloaded.

Why do I have a bunch of chromium tasks running when the browser is closing correctly?

Possible cause: when the script fails and does not reach browser.close(), the browser still takes CPU resources. This happens only in headful mode.
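If that is the cause, a try/finally around the work should close the browser even when a step throws. A minimal sketch, where withBrowser is a hypothetical wrapper and launch is any function that returns a browser, for example () => chromium.launch():

```javascript
// Hypothetical wrapper: run `task` with a browser and always close it afterwards.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Runs on success and on error, so no Chromium process is left behind.
    await browser.close();
  }
}
```

The error still reaches the caller, so a failed fill() or a timed-out download is reported as before.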

Error: fill(“input[type=password]”) timeout

The download button broke. You click it, and nothing happens. No download or anything, but if you reload the page, it works again.

Error: not constant error

use this for Twitter

Command Line Interface | Playwright

Playwright comes with the command line tools that run via npx or as a part of the npm scripts.

playwright.dev

Moving to electron integration.

A button that runs a script

script tag
on click event

Other resources:

Use these Medium keyboard shortcuts like a Pro [Checklist]
