Reading txt data with Python and storing it in MongoDB - V2EX

This topic was created 866 days ago; the information in it may have since evolved or changed.

My current way of reading the files and storing the rows is too slow. I'd appreciate any suggestions for making it faster.

import os

from pymongo import MongoClient

# Assumed connection: the original post does not show how `client` is created.
client = MongoClient('mongodb://localhost:27017')


def file_name(url):
    for root, dirs, files in os.walk(url):
        for i, filename in enumerate(files):
            # Collection name is the file name without its .txt extension.
            name = filename[:filename.index('.txt')]
            origin = client['abc'][name]
            with open(os.path.join(root, filename)) as f:
                next(f)  # skip the title line
                for line in f:
                    res = line.split(' ')
                    if len(res) <= 2:
                        continue  # header or blank line
                    obj = {
                        "trade_date": res[0],
                        "time": res[1],
                        "open": float(res[2]) if res[2] else '',
                        "high": float(res[3]) if res[3] else '',
                        "low": float(res[4]) if res[4] else '',
                        "close": float(res[5]) if res[5] else '',
                        "vol": float(10)  # hard-coded in the original post
                    }
                    if obj['trade_date'] != '':
                        # One round trip to the server per row.
                        origin.insert_one(obj)
            print(i, name, 'end')


file_name('c:/test/txt/file/')

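txt file (title and column names translated from Chinese):
B2210 Soybean No. 2 2210, 1-minute bars, forward-adjusted
date time open high low close volume open_interest settlement_price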
20211025 0901 3853 3853 3853 3853 0 0 0

7 replies • 2022-08-05 10:39:04 +08:00

1

kaedeair

2022-08-05 09:19:09 +08:00

Split the work into chunks and hand them to a process pool.
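A minimal sketch of what this suggestion could look like, assuming the unit of work is a list of file paths. The names `chunk`, `load_files`, and `load_all` are made up for illustration, and `load_files` is only a placeholder where the real parsing and inserting would go.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def load_files(paths):
    """Worker: handle one chunk of files and return how many it processed.

    Hypothetical placeholder. A real worker would create its own
    MongoClient here (PyMongo clients should not be shared across
    processes), then parse and insert the rows of each file.
    """
    return len(paths)


def load_all(directory, workers=4, chunk_size=50):
    """Distribute the .txt files in `directory` over a pool of processes."""
    paths = sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".txt")
    )
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(load_files, chunk(paths, chunk_size)))
```

On Windows, `load_all('c:/test/txt/file/')` should be called from under an `if __name__ == "__main__":` guard, because worker processes re-import the main module.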

2

tairan2006

2022-08-05 09:27:03 +08:00

Don't insert documents one at a time... write them in bulk.
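A rough sketch of bulk writing, assuming `collection` is a PyMongo `Collection` (or anything else with an `insert_many` method). The helpers `parse_line`, `batches`, and `load_lines` are illustrative names, and treating the 7th column as volume is an assumption based on the sample file.

```python
from itertools import islice

PRICE_FIELDS = ("open", "high", "low", "close")


def parse_line(line):
    """Turn one whitespace-separated row into a document, or None to skip it."""
    parts = line.split()
    if len(parts) < 7:
        return None  # title line, blank line, or truncated row
    try:
        prices = [float(p) for p in parts[2:6]]
        volume = float(parts[6])
    except ValueError:
        return None  # e.g. the column-header line
    doc = {"trade_date": parts[0], "time": parts[1]}
    doc.update(zip(PRICE_FIELDS, prices))
    doc["vol"] = volume  # assumption: the 7th column is volume
    return doc


def batches(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load_lines(lines, collection, batch_size=1000):
    """Insert parsed rows in batches; returns the number of documents sent."""
    docs = (d for d in map(parse_line, lines) if d is not None)
    total = 0
    for batch in batches(docs, batch_size):
        # One round trip per batch instead of one per row.
        collection.insert_many(batch, ordered=False)
        total += len(batch)
    return total
```

With PyMongo this could be called as `load_lines(f, client['abc'][name])` for each open file `f`. The batch size of 1000 is an arbitrary starting point, not a tuned value.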

3

CaptainD

2022-08-05 09:53:16 +08:00

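Use pandas to process the txt files in bulk, build the objects in bulk, and then insert them; after that, split the work into chunks across multiple processes. Wouldn't that approach be better?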

4

vhysug01

2022-08-05 09:57:15 +08:00

I've stored this kind of data before. I used bcolz, which persists it to disk as files.

5

httplife

2022-08-05 10:00:47 +08:00

Replace the spaces with commas, save as CSV, and then import?
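One possible way to do the conversion, sketched with the standard `csv` module. The column names in `HEADER` and the assumption that each file starts with two non-data lines come from the sample file in the question; `txt_to_csv` is an illustrative name.

```python
import csv

# Assumed column names, following the nine columns of the sample file.
HEADER = ["trade_date", "time", "open", "high", "low", "close",
          "vol", "open_interest", "settlement"]


def txt_to_csv(txt_path, csv_path, skip_lines=2, encoding="utf-8"):
    """Rewrite a space-separated txt file as CSV; returns the data row count."""
    rows = 0
    with open(txt_path, encoding=encoding) as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(HEADER)
        for index, line in enumerate(src):
            parts = line.split()
            # Skip the leading title/header lines and any malformed row.
            if index < skip_lines or len(parts) != len(HEADER):
                continue
            writer.writerow(parts)
            rows += 1
    return rows
```

The `encoding` default is a guess; files produced on a Chinese-language Windows system may need `encoding="gbk"` instead.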

6

root000

2022-08-05 10:37:48 +08:00

https://www.mongodb.com/docs/database-tools/mongoimport/

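You could try the official tool, but you'll need to convert the file format first; at the moment it only supports JSON, CSV, or TSV.

That's what I'm currently using for imports.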
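For many files, the tool can be driven from Python. The sketch below assumes `mongoimport` is on the `PATH`, that the server is on the default `localhost:27017`, and that each CSV file has a header row; `mongoimport_command` and `import_csv` are illustrative names.

```python
import subprocess


def mongoimport_command(csv_path, db, collection):
    """Build the argument list for importing one CSV file with a header row."""
    return [
        "mongoimport",
        "--db", db,
        "--collection", collection,
        "--type", "csv",
        "--headerline",  # take field names from the first line of the file
        "--file", csv_path,
    ]


def import_csv(csv_path, db, collection):
    """Run mongoimport and raise CalledProcessError if it exits non-zero."""
    subprocess.run(mongoimport_command(csv_path, db, collection), check=True)
```

For example, `import_csv("B2210.csv", "abc", "B2210")` would load one file into the `abc` database. By default mongoimport infers types from CSV values, so a time such as `0901` may be stored as a number; the `--columnsHaveTypes` option with typed header fields like `time.string()` is one way to keep it as a string.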

7

SenLief

2022-08-05 10:39:04 +08:00

Convert the txt files to CSV first, and then it will be efficient.
