Does Yahoo Finance cut off halfway when downloading bulk data?

1.

60 replies
1 Like 13 Dislike
NLV 2022-06-06 15:00:49
If you want high quality, better to pay nearly HK$2,000 a month to subscribe to eSignal
CCSTOYS苦主 2022-06-07 09:13:55
One stock per minute, that's slow...
It doesn't even take me ten minutes to run through all 2XX stocks
Did you set something wrong?
1. 2022-06-07 10:26:13
Already switched to another method
張珈晴 2022-06-07 14:41:35
How long is your period?
You should be able to set it as low as 10 seconds
1. 2022-06-07 19:29:36
With multithreading it takes ten-odd seconds, though sometimes it needs a re-run
If anyone has code to share, that's very welcome
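A minimal sketch of the multithreading idea mentioned above, assuming the usual Yahoo Finance CSV download endpoint; `build_url`, `fetch_all`, the worker count and the timestamps are illustrative, not the poster's actual code.

```python
from concurrent.futures import ThreadPoolExecutor

def build_url(ticker, period1, period2):
    # Yahoo Finance historical-data CSV endpoint (epoch-second timestamps)
    return ('https://query1.finance.yahoo.com/v7/finance/download/'
            f'{ticker}?period1={period1}&period2={period2}'
            '&interval=1d&events=history&includeAdjustedClose=true')

def fetch_all(tickers, fetch, max_workers=8):
    # Download several tickers concurrently; `fetch` maps a URL to its CSV
    # text (e.g. lambda u: requests.get(u).text) and is injectable so the
    # concurrency logic can be exercised without hitting the network.
    urls = [build_url(t, 1624307137, 1655843137) for t in tickers]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(tickers, pool.map(fetch, urls)))
```

With a real `fetch`, a few hundred tickers can finish in seconds, though Yahoo may still throttle bursts, so failed tickers may need a retry.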
GOGL 2022-06-07 21:52:04
Years ago when I downloaded David Webb data
it kept cutting off too
1. 2022-06-21 23:06:52
After digging around for a while, turns out there's a lightning-fast method; have to change the code again
1. 2022-06-21 23:17:49
Another Yahoo API
CapaCitor 2022-06-21 23:34:02
I download 1500+ stocks in 10-15 mins
Is that fast or slow?
CCSTOYS苦主 2022-06-21 23:42:11
Share it with us, bro
1. 2022-06-22 00:35:41
The tip is already above
Flinty 2022-06-22 01:40:41
import numpy as np
import pandas as pd

Go to Yahoo Finance and find the download link there; that's faster.

Example: [url]finance.yahoo/quote/AAPL/history?p=AAPL[/url]

Then open the Google Chrome Developer Tools and grab the download link's href; that's all you need.

from pyspark.sql import SparkSession

# Assumes a Spark environment; in the pyspark shell / Databricks a
# SparkSession named `spark` already exists, otherwise create one:
spark = SparkSession.builder.getOrCreate()

# Paste the href copied from Chrome Developer Tools here.
aapl_href = "<Yahoo Finance download link href>"

# pandas reads the CSV straight from the URL; Spark then wraps it.
df_aapl = spark.createDataFrame(pd.read_csv(aapl_href))


This method is a lot faster.



If you want to extract a whole batch in one go, you have to hard-code the list of tickers
Flinty 2022-06-22 04:51:47
If you plan to import financial data in bulk but all you have is DataFrames, with no other storage (e.g. SQL database, data lake), then you definitely need to download the data to your local machine, so that the next time you boot up you can use it straight away.


It would be written something like this. (I haven't actually run the code.)

The URL is only for downloading the list of tickers, but the times are hard-coded. If you want to change the time range, hard-code the times above and splice them back into the URL. The times should be those 1624307... numbers. (I can't make sense of their date format.)
Flinty 2022-06-22 04:51:58
import io

import pandas as pd
import requests

ticker_list = ["AAPL", "AAL", "A", "SPY"]

for i in ticker_list:
    # Yahoo Finance historical-data CSV endpoint; period1/period2 are
    # hard-coded here and would need updating for a different date range
    url = ('https://query1.finance.yahoo.com/v7/finance/download/' + i +
           '?period1=1624307137&period2=1655843137'
           '&interval=1d&events=history&includeAdjustedClose=true')

    # Request the CSV, then build a DataFrame from the response text
    urlfile = requests.get(url)
    df = pd.read_csv(io.StringIO(urlfile.text))

    # Save each ticker to its own local CSV
    df.to_csv('C:\\Users\\Download\\' + i + '.csv')
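On the date format mentioned above: the period1/period2 numbers in the URL look like Unix epoch timestamps (seconds since 1970-01-01 UTC). A small sketch under that assumption; `to_epoch` is an illustrative helper, not from the thread.

```python
from datetime import datetime, timezone

def to_epoch(year, month, day):
    # Build the integer Yahoo appears to expect for period1/period2:
    # seconds since 1970-01-01 00:00 UTC for midnight of the given date
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())
```

For example, `to_epoch(2021, 6, 21)` gives 1624233600, close to the hard-coded 1624307137 (which also carries a time-of-day component).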
carlam 2022-06-22 04:58:19
Is the error that pops up "YAHOO finance is out of service"?
Outliers 2022-06-22 05:42:58
Fuck's sake, when people ask you, just answer; what's with the "tip" crap
When you had something you didn't understand and posted it here, everyone answered you seriously. Then when people ask you something back, you keep it all to yourself, and even act smug about having "already given a hint". What a scumbag

If you really want to keep it to yourself, just don't mention the lightning-fast method at all
Hinting that you've got some amazing method and then not fucking saying what it is, what kind of attitude is that
1. 2022-06-22 08:34:48
Will try it later
1. 2022-06-22 08:38:08
Apart from Flinty, who answered seriously?
If you've got the goods yourself, share them; otherwise what standing do you have to moan
Total clown
1. 2022-06-22 08:47:00
As soon as there are lots of tickers, it errors out
Flinty 2022-06-22 09:29:39
What's the error message?
Flinty 2022-06-22 09:39:24
I'll take another look for you after work tonight

Reading back through the earlier comments, you said it was caused by requests being too frequent.



Might need to add time.sleep() to put in a time delay.
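A minimal sketch of the time.sleep() suggestion, assuming the cut-offs come from rate limiting; `throttled` and the delay value are illustrative, not confirmed against Yahoo's actual limits.

```python
import time

def throttled(items, worker, delay=2.0, sleep=time.sleep):
    # Apply `worker` to each item, pausing `delay` seconds between calls
    # so the requests are spaced out; `sleep` is injectable for testing.
    results = []
    for n, item in enumerate(items):
        if n > 0:
            sleep(delay)
        results.append(worker(item))
    return results
```

In the download loop above, `worker` would be the per-ticker fetch-and-save step, e.g. `throttled(ticker_list, download_one)`.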
Outliers 2022-06-22 09:56:05
OP has already sorted it out, with a lightning-fast method no less
No need to look at it for him