
GAE Python threads do not run in parallel


I am trying to create a simple web app using Python on GAE. The app needs to spawn a few threads for each request it receives. For this I am using Python's threading library. I start all the threads and then wait on them.


t1.start()
t2.start()
t3.start()

t1.join()
t2.join()
t3.join()

The application runs fine, except that the threads run serially rather than concurrently (I confirmed this by printing timestamps at the beginning and end of each thread's run() method). I have followed the instructions given in https://code.google.com/appengine/docs/python/python27/using27.html#Multithreading to enable multithreading.
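The timestamp check described above can be reproduced outside GAE with a minimal sketch like the following. It is not the original code: the names timed_worker and run_workers are made up for illustration, and time.sleep stands in for the URL call and the processing, which the question does not show.

```python
import threading
import time


def timed_worker(name, delay, log, lock):
    """Record start/end timestamps around a blocking call."""
    start = time.time()
    time.sleep(delay)  # assumption: placeholder for the real URL open + processing
    end = time.time()
    with lock:
        log.append((name, start, end))


def run_workers(delays):
    """Start one thread per delay, join them all, return (log, elapsed seconds)."""
    log = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=timed_worker, args=("t%d" % i, d, log, lock))
        for i, d in enumerate(delays, 1)
    ]
    began = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, time.time() - began


if __name__ == "__main__":
    log, elapsed = run_workers([0.3, 0.3, 0.3])
    for name, start, end in sorted(log):
        print("%s ran from %.3f to %.3f" % (name, start, end))
    print("total: %.3f s" % elapsed)
```

If the threads overlap, the total is close to the longest single delay; if they run one after another, it is close to the sum of the delays.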


My app.yaml looks like:


application: myapp
version: 1
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /favicon\.ico
  static_files: favicon.ico
  upload: favicon\.ico

- url: /stylesheet
  static_dir: stylesheet

- url: /javascript
  static_dir: javascript

- url: /pages
  static_dir: pages

- url: .*
  script: main.app

I made sure that my local GoogleAppLauncher uses Python 2.7 by setting the path explicitly in the preferences.


My threads have an average run time of 2-3 seconds, during which they make a URL open call and do some processing on the result.


Am I doing something wrong, or missing some configuration to enable multithreading?


3 Answers

#1


17  

Are you experiencing this in the dev_appserver or after uploading your app to the production service? From your mention of GoogleAppLauncher it sounds like you may be seeing this in the dev_appserver; the dev_appserver does not emulate the threading behavior of the production servers, and you'd be surprised to find that it works just fine after you deploy your app. (If not, add a comment here.)


Another idea: if you are mostly waiting for the urlfetch, you can run many urlfetch calls in parallel by using the async interface to urlfetch: https://code.google.com/appengine/docs/python/urlfetch/asynchronousrequests.html


This approach does not require threads. (It still doesn't properly parallelize the requests in the dev_appserver; but it does do things properly on the production servers.)
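A rough sketch of that async pattern, for illustration only: the helper name fetch_all and the deadline value are assumptions, and the urlfetch module is passed in as an argument so the function can be read and tried without the SDK. On App Engine you would pass google.appengine.api.urlfetch, whose create_rpc, make_fetch_call and get_result calls are the ones used here.

```python
def fetch_all(urlfetch, urls, deadline=10):
    """Start every fetch first, then collect results, so the waits overlap.

    `urlfetch` is expected to be google.appengine.api.urlfetch (assumption:
    passed in by the caller rather than imported here).
    """
    rpcs = []
    for url in urls:
        rpc = urlfetch.create_rpc(deadline=deadline)
        urlfetch.make_fetch_call(rpc, url)  # returns immediately
        rpcs.append((url, rpc))

    # get_result() blocks until that particular fetch has finished
    # and raises if the fetch failed.
    return dict((url, rpc.get_result()) for url, rpc in rpcs)


# Hypothetical usage inside a request handler on App Engine:
#   from google.appengine.api import urlfetch
#   results = fetch_all(urlfetch, ["http://example.com/a", "http://example.com/b"])
#   body = results["http://example.com/a"].content
```

The point of the two-phase shape is that no get_result() call happens until every request has been issued.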


#2


1  

The multithreading notes for GAE are merely for how requests are handled - they don't fundamentally change how Python threads work. Specifically, the "CPython Implementation Detail" note in the threading module docs still applies.


It's also worth mentioning the note in the "Sandboxing" section of the GAE docs:


Note that threads will be joined by the runtime when the request ends, so the threads cannot run past the end of the request.


#3


0  

If your threads are mostly waiting for datastore operations, you may try the NDB module that's part of 1.6.2. The semantics will be close enough to what you are doing.


IIRC, the multithreading flag enables one server instance to serve multiple requests on separate threads, but won't allow you to start threads yourself. If you didn't need to sync them before returning, you could put them on separate tasks and delegate them to one or more task queues.
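The task queue suggestion might look roughly like this. It is only a sketch: the helper name enqueue_fetches, the worker path /tasks/fetch and the "target" parameter are hypothetical, and the taskqueue module is passed in so the function does not depend on the SDK being importable. On App Engine you would pass google.appengine.api.taskqueue and map the worker path to a handler that does the fetch and processing.

```python
def enqueue_fetches(taskqueue, urls, worker_url="/tasks/fetch", queue_name="default"):
    """Queue one task per URL instead of starting a thread per URL.

    `taskqueue` is expected to be google.appengine.api.taskqueue (assumption:
    passed in by the caller). `worker_url` is a hypothetical handler path that
    would have to exist in the app.
    """
    tasks = []
    for url in urls:
        task = taskqueue.add(
            url=worker_url,
            params={"target": url},  # hypothetical parameter read by the worker
            queue_name=queue_name,
        )
        tasks.append(task)
    return tasks


# Hypothetical usage inside a request handler on App Engine:
#   from google.appengine.api import taskqueue
#   enqueue_fetches(taskqueue, ["http://example.com/a", "http://example.com/b"])
```

As the answer notes, this only fits if the request does not need the results before it returns, since the tasks run outside the original request.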


