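1. Introduction
In this installment we return to AI-based stock prediction and apply it to another interesting target: forecasting the A-share market index, here the Shanghai Composite.
2. How AI relates to stocks
In stock prediction, AI acts as a tool for data analysis and pattern recognition. It cannot guarantee correct forecasts, but it offers a new way to gain insight into how the market behaves.
3. Data collection and processing (pulling the latest Shanghai Composite Index data with akshare)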
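import akshare as ak
import numpy as np
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
import xgboost as xgb
# sh000001 is the Shanghai Composite Index; the code below uses the
# date, open, close, high and low columns of the returned frame
df = ak.stock_zh_index_daily_em(symbol='sh000001')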
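Preprocessing: convert dates into time features, then combine them with K-line (candlestick) features
today = datetime.today()
date_str = today.strftime("%Y%m%d")
# Midnight today as a Unix timestamp; every date is measured relative to it
base = int(datetime.strptime(date_str, "%Y%m%d").timestamp())
# Days between a YYYYMMDD date and today (negative for past dates)
change1 = lambda x: (int(datetime.strptime(x, "%Y%m%d").timestamp()) - base) / 86400
# Day of the month (1-31)
change2 = lambda x: (datetime.strptime(str(x), "%Y%m%d")).day
# Day of the week (Monday=0 ... Sunday=6)
change3 = lambda x: datetime.strptime(str(x), "%Y%m%d").weekday()
# Normalise dates to YYYYMMDD strings (works for 'YYYY-MM-DD' strings and date objects)
df['date'] = pd.to_datetime(df['date']).dt.strftime('%Y%m%d')
X = df['date'].apply(change1).values.reshape(-1, 1)
X_month_day = df['date'].apply(change2).values.reshape(-1, 1)
X_week_day = df['date'].apply(change3).values.reshape(-1, 1)
# Drop the first 29 rows so the time features line up with the 30-day rolling windows
XX = np.concatenate((X, X_week_day, X_month_day), axis=1)[29:]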
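The three date lambdas above can also be written with the vectorised `.dt` accessor in pandas. This is only a sketch, not the code used in this post: the helper name `calendar_features` and the sample dates are made up for illustration, and it counts whole calendar days instead of dividing a timestamp difference by 86400.

```python
import numpy as np
import pandas as pd


def calendar_features(dates, base):
    """Return columns [day offset from base, weekday, day of month].

    `dates` holds YYYYMMDD strings; `base` is any value pd.Timestamp accepts.
    """
    parsed = pd.to_datetime(pd.Series(dates), format="%Y%m%d")
    base = pd.Timestamp(base).normalize()
    offset = (parsed - base).dt.days.to_numpy()   # negative for past dates
    weekday = parsed.dt.weekday.to_numpy()        # Monday=0 ... Sunday=6
    month_day = parsed.dt.day.to_numpy()          # 1-31
    return np.column_stack([offset, weekday, month_day])


# Two sample dates measured against Monday 2024-01-08
feats = calendar_features(["20240102", "20240105"], base="2024-01-08")
print(feats)
```

The column order matches the one used for `XX` (offset, weekday, day of month), so the result could be dropped in where the three `apply` calls are concatenated.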
# K-line features: every column except the date, min-max scaled to [0, 1]
FT = np.array(df.drop(columns=['date']), dtype=float)
min_vals = np.min(FT, axis=0)
max_vals = np.max(FT, axis=0)
FT = (FT - min_vals) / (max_vals - min_vals)
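The min-max step above divides by `max - min`, which is zero for any column that never changes. That is unlikely for index data, but a guarded version is cheap. The sketch below is an illustration under that assumption; the function name `min_max_scale` is hypothetical and the input is toy data.

```python
import numpy as np


def min_max_scale(values):
    """Scale each column to [0, 1]; constant columns map to 0 instead of NaN."""
    values = np.asarray(values, dtype=float)
    lo = values.min(axis=0)
    hi = values.max(axis=0)
    # Replace a zero range with 1 so the division is always defined
    span = np.where(hi > lo, hi - lo, 1.0)
    return (values - lo) / span


# Second column is constant on purpose
scaled = min_max_scale([[1.0, 5.0], [3.0, 5.0], [2.0, 5.0]])
print(scaled)
```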
window_size = 30
num_rows, num_columns = FT.shape
new_num_rows = num_rows - window_size + 1
# Rolling mean over each 30-day window
result1 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result1[i] = np.mean(window, axis=0)
# Rolling maximum
result2 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result2[i] = np.max(window, axis=0)
# Rolling minimum
result3 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result3[i] = np.min(window, axis=0)
# Rolling standard deviation
result4 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result4[i] = np.std(window, axis=0)
# Stack the four statistics side by side and append them to the time features
result_list = [result1, result2, result3, result4]
result = np.hstack(result_list)
XX = np.concatenate((XX, result), axis=1)
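The four loops above compute the mean, max, min and standard deviation of every 30-row window. As an alternative sketch (not the code this post ran), the same statistics can be obtained with `DataFrame.rolling` in pandas. The helper name `rolling_stats` and the demo array are assumptions for illustration. Note that `np.std` uses the population formula, so `ddof=0` has to be passed to the pandas `std` to match it.

```python
import numpy as np
import pandas as pd


def rolling_stats(features, window):
    """Return [mean | max | min | std] of each full window, stacked column-wise."""
    roll = pd.DataFrame(features).rolling(window)
    # ddof=0 matches np.std, which divides by N rather than N - 1
    parts = [roll.mean(), roll.max(), roll.min(), roll.std(ddof=0)]
    stacked = pd.concat(parts, axis=1).to_numpy()
    # The first window - 1 rows have incomplete windows (NaN); drop them
    return stacked[window - 1:]


demo = np.arange(12, dtype=float).reshape(6, 2)
stats = rolling_stats(demo, window=3)
print(stats.shape)  # (4, 8): 6 - 3 + 1 windows, 4 statistics x 2 columns
```

The output has one row per complete window, in the same order as `np.hstack([result1, result2, result3, result4])`.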
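4. Prediction model (XGBoost)
# One target per price series; skip the first 29 rows so targets align with XX
y1 = df['open'][29:]
y2 = df['close'][29:]
y3 = df['high'][29:]
y4 = df['low'][29:]
# One regressor per target, all with default hyperparameters
models1 = xgb.XGBRegressor()
models2 = xgb.XGBRegressor()
models3 = xgb.XGBRegressor()
models4 = xgb.XGBRegressor()
models1.fit(XX, y1)
models2.fit(XX, y2)
models3.fit(XX, y3)
models4.fit(XX, y4)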
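The models above are fit on every available row, so nothing in the pipeline measures out-of-sample error. One simple way to get a rough number is to hold out the most recent rows and compare against a naive baseline. The sketch below uses only NumPy and toy data; `chronological_split` and `mean_absolute_error` are hypothetical helpers, not part of this post's code or of XGBoost.

```python
import numpy as np


def chronological_split(X, y, test_size):
    """Keep the last `test_size` rows as the hold-out set, without shuffling."""
    X, y = np.asarray(X), np.asarray(y)
    if not 0 < test_size < len(y):
        raise ValueError("test_size must be between 1 and len(y) - 1")
    cut = len(y) - test_size
    return X[:cut], X[cut:], y[:cut], y[cut:]


def mean_absolute_error(y_true, y_pred):
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


# Toy data standing in for XX and one price target
X = np.arange(10).reshape(-1, 1)
y = np.arange(10.0, 20.0)
X_train, X_test, y_train, y_test = chronological_split(X, y, test_size=3)

# Persistence baseline: repeat the last training value for every held-out day
baseline = np.full(len(y_test), y_train[-1])
baseline_mae = mean_absolute_error(y_test, baseline)
print(baseline_mae)
```

With real data, a regressor would be fit on `X_train, y_train` and its predictions on `X_test` scored with the same function; a model that cannot beat the persistence baseline is not adding information.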
5. Application and plotting
# The five weekdays after today (exchange holidays are not excluded)
bday_cn = CustomBusinessDay(weekmask='Mon Tue Wed Thu Fri')
# Start from the next weekday so a weekend run still begins on Monday
start_date = pd.to_datetime(today).normalize() + bday_cn
future_dates = pd.date_range(start=start_date, periods=5, freq=bday_cn)
future_dates_str = pd.Series(future_dates.strftime('%Y%m%d'))
X_x = future_dates_str.apply(change1).values.reshape(-1, 1)
X_month_day_x = future_dates_str.apply(change2).values.reshape(-1, 1)
X_week_day_x = future_dates_str.apply(change3).values.reshape(-1, 1)
XXX = np.concatenate((X_x, X_week_day_x, X_month_day_x), axis=1)
# Future K-line data is unknown, so reuse the latest rolling statistics for all five days
last_row = result[-1:, ]
repeated_last_row = np.tile(last_row, (5, 1))
XXX = np.concatenate((XXX, repeated_last_row), axis=1)
pred1 = models1.predict(XXX)
pred2 = models2.predict(XXX)
pred3 = models3.predict(XXX)
pred4 = models4.predict(XXX)
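The weekday mask above treats every Monday-to-Friday date as a trading day, so exchange holidays would still show up among the forecast dates. `CustomBusinessDay` also accepts a `holidays` argument, which the sketch below uses. The helper name `next_trading_days` is hypothetical, and the holiday in the example is invented purely to show the effect; a real exchange calendar would have to be supplied separately.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay


def next_trading_days(last_date, n, holidays=()):
    """Return the n weekdays after last_date, skipping the given holidays."""
    step = CustomBusinessDay(weekmask='Mon Tue Wed Thu Fri',
                             holidays=list(holidays))
    # Adding one step moves to the next valid day, also from a weekend
    first = pd.Timestamp(last_date).normalize() + step
    return pd.date_range(start=first, periods=n, freq=step)


# 2024-01-05 is a Friday; 2024-01-09 is a made-up holiday for the demo
days = next_trading_days('2024-01-05', 5, holidays=['2024-01-09'])
labels = list(days.strftime('%Y%m%d'))
print(labels)
```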
# Last 30 real trading days, followed by the 5 predicted days
y1 = np.array(df['open'][-30:])
y2 = np.array(df['close'][-30:])
y3 = np.array(df['high'][-30:])
y4 = np.array(df['low'][-30:])
YD = np.array(df['date'][-30:])
# Capitalised column names are the mplfinance defaults and work across its versions
data = {
    'Open': np.concatenate([y1, pred1]),
    'Close': np.concatenate([y2, pred2]),
    'High': np.concatenate([y3, pred3]),
    'Low': np.concatenate([y4, pred4]),
    'date': np.concatenate([YD, np.array(future_dates_str)])
}
df = pd.DataFrame(data)
import mplfinance as mpf
# mplfinance expects a DatetimeIndex
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d')
df.set_index('date', inplace=True)
my_color = mpf.make_marketcolors(up='red',      # rising candles in red
                                 down='green',  # falling candles in green
                                 # edge='i',    # hide candle edges
                                 # volume='in', # colour volume bars the same way
                                 inherit=True)
my_style = mpf.make_mpf_style(
    # gridaxis='both',  # draw grid lines
    # gridstyle='-.',
    # y_on_right=True,
    marketcolors=my_color)
mpf.plot(df, type='candle',
         style=my_style,
         # datetime_format='%Y-%m-%d',
         title='Stock K-Line')
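Because open, close, high and low come from four independent models, a predicted candle can end up with a high below its open or a low above its close, which draws oddly. One possible clean-up before plotting is sketched below. This is an assumption about how such candles might be handled, not something this post does; `make_candles_consistent` is a hypothetical helper and the numbers are toy values.

```python
import numpy as np


def make_candles_consistent(open_, close, high, low):
    """Widen high/low so each candle's range contains its open and close."""
    open_, close, high, low = (np.asarray(a, dtype=float)
                               for a in (open_, close, high, low))
    fixed_high = np.maximum.reduce([open_, close, high, low])
    fixed_low = np.minimum.reduce([open_, close, high, low])
    return open_, close, fixed_high, fixed_low


# First candle: predicted high (10.5) sits below the close (11.0)
# Second candle: predicted low (11.8) sits above the close (11.5)
o, c, h, l = make_candles_consistent(
    [10.0, 12.0], [11.0, 11.5], [10.5, 12.5], [9.8, 11.8])
print(h, l)
```

Only the high and low are adjusted; the predicted open and close are left untouched.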
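6. Results (forecast for next week's Shanghai Composite: the last five candles in the chart are predictions)
Summary of what the chart shows:
1. Monday through Wednesday: a very slight rise.
2. Thursday and Friday of next week: the index opens high and keeps climbing (surprisingly).
Anyone positioning ahead of that move would probably look to buy near Thursday's low.
Full code, runnable in one go: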
import akshare as ak
import numpy as np
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
import xgboost as xgb
import mplfinance as mpf

# --- Data: Shanghai Composite Index (sh000001) ---
df = ak.stock_zh_index_daily_em(symbol='sh000001')

# --- Time features ---
today = datetime.today()
date_str = today.strftime("%Y%m%d")
base = int(datetime.strptime(date_str, "%Y%m%d").timestamp())
# Days relative to today, day of month, day of week (Monday=0)
change1 = lambda x: (int(datetime.strptime(x, "%Y%m%d").timestamp()) - base) / 86400
change2 = lambda x: (datetime.strptime(str(x), "%Y%m%d")).day
change3 = lambda x: datetime.strptime(str(x), "%Y%m%d").weekday()
df['date'] = pd.to_datetime(df['date']).dt.strftime('%Y%m%d')
X = df['date'].apply(change1).values.reshape(-1, 1)
X_month_day = df['date'].apply(change2).values.reshape(-1, 1)
X_week_day = df['date'].apply(change3).values.reshape(-1, 1)
XX = np.concatenate((X, X_week_day, X_month_day), axis=1)[29:]

# --- K-line features: min-max scaling, then 30-day rolling statistics ---
FT = np.array(df.drop(columns=['date']), dtype=float)
min_vals = np.min(FT, axis=0)
max_vals = np.max(FT, axis=0)
FT = (FT - min_vals) / (max_vals - min_vals)
window_size = 30
num_rows, num_columns = FT.shape
new_num_rows = num_rows - window_size + 1
result1 = np.empty((new_num_rows, num_columns))  # rolling mean
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result1[i] = np.mean(window, axis=0)
result2 = np.empty((new_num_rows, num_columns))  # rolling max
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result2[i] = np.max(window, axis=0)
result3 = np.empty((new_num_rows, num_columns))  # rolling min
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result3[i] = np.min(window, axis=0)
result4 = np.empty((new_num_rows, num_columns))  # rolling std
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result4[i] = np.std(window, axis=0)
result_list = [result1, result2, result3, result4]
result = np.hstack(result_list)
XX = np.concatenate((XX, result), axis=1)

# --- Models: one XGBoost regressor per price series ---
y1 = df['open'][29:]
y2 = df['close'][29:]
y3 = df['high'][29:]
y4 = df['low'][29:]
models1 = xgb.XGBRegressor()
models2 = xgb.XGBRegressor()
models3 = xgb.XGBRegressor()
models4 = xgb.XGBRegressor()
models1.fit(XX, y1)
models2.fit(XX, y2)
models3.fit(XX, y3)
models4.fit(XX, y4)

# --- Features for the next five weekdays (exchange holidays are not excluded) ---
bday_cn = CustomBusinessDay(weekmask='Mon Tue Wed Thu Fri')
start_date = pd.to_datetime(today).normalize() + bday_cn
future_dates = pd.date_range(start=start_date, periods=5, freq=bday_cn)
future_dates_str = pd.Series(future_dates.strftime('%Y%m%d'))
X_x = future_dates_str.apply(change1).values.reshape(-1, 1)
X_month_day_x = future_dates_str.apply(change2).values.reshape(-1, 1)
X_week_day_x = future_dates_str.apply(change3).values.reshape(-1, 1)
XXX = np.concatenate((X_x, X_week_day_x, X_month_day_x), axis=1)
# Reuse the latest rolling statistics for all five future days
last_row = result[-1:, ]
repeated_last_row = np.tile(last_row, (5, 1))
XXX = np.concatenate((XXX, repeated_last_row), axis=1)
pred1 = models1.predict(XXX)
pred2 = models2.predict(XXX)
pred3 = models3.predict(XXX)
pred4 = models4.predict(XXX)

# --- Plot: last 30 real days followed by the 5 predicted days ---
y1 = np.array(df['open'][-30:])
y2 = np.array(df['close'][-30:])
y3 = np.array(df['high'][-30:])
y4 = np.array(df['low'][-30:])
YD = np.array(df['date'][-30:])
# Capitalised column names are the mplfinance defaults
data = {
    'Open': np.concatenate([y1, pred1]),
    'Close': np.concatenate([y2, pred2]),
    'High': np.concatenate([y3, pred3]),
    'Low': np.concatenate([y4, pred4]),
    'date': np.concatenate([YD, np.array(future_dates_str)])
}
df = pd.DataFrame(data)
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d')
df.set_index('date', inplace=True)
my_color = mpf.make_marketcolors(up='red',      # rising candles in red
                                 down='green',  # falling candles in green
                                 inherit=True)
my_style = mpf.make_mpf_style(marketcolors=my_color)
mpf.plot(df, type='candle',
         style=my_style,
         title='Stock K-Line')