InternetPirate

@InternetPirate@lemmy.fmhy.ml


InternetPirate,

wget -mkEpnp

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent example.org

Explanation of the various flags:

<pre style="background-color:#ffffff;">
--mirror – makes (among other things) the download recursive.
--convert-links – converts all links (including to things like CSS stylesheets) to relative ones, so the site is suitable for offline viewing.
--adjust-extension – adds suitable extensions (html or css) to filenames depending on their content type.
--page-requisites – downloads things like CSS stylesheets and images required to properly display the page offline.
--no-parent – when recursing, does not ascend to the parent directory. Useful for restricting the download to only a portion of the site.
</pre>

wget -mpHkKEb -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' www.example.com

<pre style="background-color:#ffffff;">
-m (--mirror) : turn on options suitable for mirroring (infinite recursion depth and timestamping).

-p (--page-requisites) : download all files necessary to properly display a given HTML page, such as inlined images, sounds, and referenced stylesheets.

-H (--span-hosts) : enable spanning across hosts when doing recursive retrieval.

-k (--convert-links) : after the download, convert the links in documents for local viewing.

-K (--backup-converted) : when converting a file, back up the original version with a .orig suffix. Affects the behavior of -N.

-E (--adjust-extension) : add the proper extension to the end of the file.

-b (--background) : go to background immediately after startup. If no output file is specified via -o, output is redirected to wget-log.

-e (--execute) : execute a command (here robots=off).

-t number (--tries=number) : set the number of retries to number.

-U (--user-agent) : identify as agent-string to the HTTP server. Some servers may ban you permanently for recursive downloads if you send the default User-Agent.
</pre>

Cronjobs

0 23 * * * cd ~/Documents/Webs/mirror; wget -mpk -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' example.com

0 23 * * * cd ~/Documents/Webs/mirror; wget -mpkH -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' -D example.com example.com

0 23 * 1 * cd ~/Documents/Webs/mirror; wget -mpk -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' example.com

0 8 * * * pkill wget; cd ~/Documents/Webs/mirror/; rm wget*

InternetPirate,

What are the best Linux CLI tools to download a mirror of a website for offline browsing?

InternetPirate,

There are several Linux command-line tools that can be used to download a mirror of a website for offline browsing. Here are some of the most popular ones:

  1. wget: wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. To download a website recursively, use the --mirror option (shorthand for -r -N -l inf --no-remove-listing). For example, the following command will download the entire website located at http://example.com:

    <pre style="background-color:#ffffff;">
    wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.com
    </pre>
    

    This will create a local copy of the website in a directory named example.com.

  2. httrack: httrack is a free and open-source website copier that allows you to download a website and browse it offline. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. To download a website, use the -O option followed by the directory where you want to save the website. For example, the following command will download the entire website located at http://example.com:

    <pre style="background-color:#ffffff;">
    httrack http://example.com -O /path/to/save/directory
    </pre>
    

    This will create a local copy of the website in the directory /path/to/save/directory/example.com.

  3. curl: curl is a command-line tool for transferring data from or to a server using one of the supported protocols (HTTP, HTTPS, FTP, etc.). Note that curl does not support recursive retrieval, so it cannot mirror a whole site by itself; it is best suited to fetching individual pages. For example, the following command will download the page at http://example.com, following redirects and retrying on transient failures:

    <pre style="background-color:#ffffff;">
    curl -L -o example.html -R --retry 3 --retry-delay 5 -A 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36' http://example.com
    </pre>

    This will save a local copy of the page in the current directory.

These tools have different options and features, so you should choose the one that best suits your needs.
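If you already have a Python workflow, the wget invocation above can also be driven from a script. A minimal sketch (the helper `build_mirror_cmd` is hypothetical, not part of any library; it simply assembles the flags discussed earlier):

```python
import subprocess


def build_mirror_cmd(url: str) -> list[str]:
    """Assemble the wget mirror command from the flags discussed above."""
    return [
        "wget",
        "--mirror",
        "--convert-links",
        "--adjust-extension",
        "--page-requisites",
        "--no-parent",
        url,
    ]


# To actually run the mirror (requires wget on PATH):
# subprocess.run(build_mirror_cmd("http://example.com"), check=True)
```

Keeping the command as a list avoids shell-quoting problems when the URL contains special characters.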


limit python requests to 1.39 requests per second

To limit your Python requests to 1.39 requests per second, you can use the time.sleep() function from the time module to introduce a delay between requests. You can also use third-party libraries like pyrate-limiter, requests-ratelimiter, or ratelimiter to achieve rate limiting...

InternetPirate,

To achieve a rate of 1.39 requests per second, you can use a global variable to keep track of the time elapsed between requests and then calculate the delay based on that time. Here’s a modified version of your fetch_github_data function that implements this:

<pre style="background-color:#ffffff;">
import time
import requests
import logging

PERSONAL_ACCESS_TOKEN = "your_personal_access_token"
DELAY = 1 / 1.39  # Calculate the delay for 1.39 requests per second
last_request_time = 0  # Initialize a global variable to store the last request time


def fetch_github_data(url):
    global last_request_time  # Access the global variable

    try:
        headers = {
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {PERSONAL_ACCESS_TOKEN}",
            "X-GitHub-Api-Version": "2022-11-28",
        }

        # Calculate the time elapsed since the last request
        time_elapsed = time.time() - last_request_time

        # Calculate the required delay based on the time elapsed
        required_delay = max(0, DELAY - time_elapsed)

        # Sleep for the required delay
        time.sleep(required_delay)

        response = requests.get(url, headers=headers)

        # Update the last request time
        last_request_time = time.time()

        logging.info(f"Fetched data from {url}")
        return response.json()
    except requests.exceptions.RequestException as e:
        logging.exception(f"Error fetching data from {url}\n{e}")
        raise
</pre>

This code calculates the required delay based on the desired rate of 1.39 requests per second and the time elapsed since the last request. It then sleeps for the required delay before making the next request. The global variable last_request_time is used to keep track of the time of the last request.
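One caveat: a bare global like last_request_time is not safe if fetch_github_data is called from several threads at once. A thread-safe variant of the same delay logic can be sketched with a lock (the `RateLimiter` class here is illustrative, not from any library):

```python
import threading
import time


class RateLimiter:
    """Thread-safe version of the delay logic above.

    Serializes access to the last-request timestamp with a lock, so
    concurrent callers still respect the overall rate.
    """

    def __init__(self, max_per_second: float):
        self.delay = 1 / max_per_second
        self._lock = threading.Lock()
        self._last = 0.0  # monotonic timestamp of the previous request

    def wait(self):
        with self._lock:
            elapsed = time.monotonic() - self._last
            sleep_for = max(0.0, self.delay - elapsed)
            if sleep_for:
                time.sleep(sleep_for)
            self._last = time.monotonic()


limiter = RateLimiter(1.39)
```

Call `limiter.wait()` immediately before each `requests.get()`; `time.monotonic()` is used instead of `time.time()` because it cannot jump backwards if the system clock changes.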


InternetPirate,

It isn’t ChatGPT. It’s an LLM with search.

Are AI-Generated Solutions Welcomed on programming.dev?

I wanted to start a discussion about the use of AI-generated solutions on Programming.dev. Personally, I’ve found that AI-powered tools have been incredibly helpful in solving programming questions. I won’t name any specific commercial software, but I use one that combines GPT-4 and web search to get more factual...

InternetPirate, (edited )

Just change lemmy.post.create to lemmy.post.createe to trigger an AttributeError. That way you can debug the code without creating any posts. You can also use many print statements all around the code, I would use two for each line to make sure the computer isn’t fooling you. Lastly, you can spin up your own Lemmy instance to not have to worry about the generated posts.
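The rename trick works because the misspelled attribute raises an AttributeError before any network call happens. A sketch with a stand-in object (`FakePostApi` is hypothetical, not part of any Lemmy client library):

```python
class FakePostApi:
    """Stand-in for a Lemmy client's post API."""

    def create(self, community_id, name):
        raise RuntimeError("this would create a real post!")


lemmy_post = FakePostApi()

# The misspelled attribute fails before anything is posted:
try:
    lemmy_post.createe(community_id=1, name="test")  # deliberate typo
except AttributeError as exc:
    print(f"dry run stopped as expected: {exc}")
```

The rest of the code up to that call still executes normally, so you can verify the post payload logic without side effects.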

How to see the feed of another instance?

I’m wondering if it’s possible to see the local feed of another instance from the one I’m using. I’m interested in exploring content from other instances without having to visit every single community, but I’m not sure how to do it. I’ve tried searching for a way to do this on the documentation and using the Lemmy...

Does commenting on your own post bump it on the active filter view in Lemmy?

In Lemmy, the active filter view is designed to prioritize posts with the latest activity, similar to how forums work. However, it remains unclear whether commenting on your own post in Lemmy will bump it on the active filter view. Some forum platforms, such as Discourse, allow a practice known as the “ghost bump,” where...

InternetPirate,

Testing.

join-lemmy.org/docs/…/03-votes-and-ranking.html

Edit: I was wrong; the ranking that works like forums is New Comments, and yes, it seems to take the OP’s comments into account.

Seeking a Forgejo-hosted Mirror for Lemmy Outside the US

As an enthusiastic supporter of Lemmy, I am eager to contribute to the project. However, I hold strong reservations about writing a single line of code for a project hosted on a Micro$oft server. While I have created a few issues on GitHub, I firmly believe that my contributions could be significantly amplified if there were a...

InternetPirate,

The paper actually demonstrates a 16-million-token context window with 92% accuracy. Most models can be retrained to have a 100k context window with over 92% accuracy, but the accuracy drops to 74% at 256k. The code has already been released on GitHub as well. I’m excited to see the development of 100k models using this method soon!

Where can I seek medical advice online?

Recently, I found myself questioning the accuracy of a diagnosis provided by a doctor I visited. Surprisingly, an AI seemed to offer a more insightful assessment. However, I understand the importance of not solely relying on AI-generated information. With that in mind, I’m eager to discover a reputable online platform where I...

InternetPirate,

Comparing current LLMs with autocomplete is stupid. An autocomplete can’t pass law or biology exams in the 90th percentile like GPT-4 can.

InternetPirate, (edited )

You don’t have any idea of how GPT works. Read about it and then we can talk.

Price of insulin in China, Cuba, Laos, North Korea, and Vietnam?

Information on the price of insulin in China, Cuba, Laos, North Korea, and Vietnam is not readily available in the search results. However, there is information on the cost of insulin in some other countries. According to the World Population Review, the cost of insulin varies widely by country, with the United States having the...
