High level search with PHP and Apache Solr

Posted on 23rd March 2010 by Nio in Java, Lucene, Programming Life

When data sets grow large and searching them with MySQL queries becomes too slow and too heavy on the server, a full-text index is required. Several solutions are available, but in this article I will demonstrate Solr, the Apache Software Foundation's Java search server built on Lucene. For this, a Java build will be required. That is less of a problem on Linux or Mac; on Windows I use the Apache Tomcat server.
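As a rough illustration of what searching Solr from PHP can look like, here is a minimal sketch that builds a query URL for Solr's HTTP select handler and decodes the JSON response. The function names `solr_select_url()` and `solr_search()` and the base URL are assumptions made for this example, not part of the original article; the path to the select handler depends on how your Solr instance is deployed.

```php
<?php
// Builds a URL for Solr's select handler.
// solr_select_url() is a hypothetical helper name; $base_url is an assumption
// and must point at your own Solr deployment (and core, if you use cores).
function solr_select_url($base_url, $query, $rows = 10)
{
    $params = array(
        'q'    => $query,  // query in Lucene syntax
        'rows' => $rows,   // maximum number of documents to return
        'wt'   => 'json',  // ask Solr for a JSON response
    );
    return rtrim($base_url, '/') . '/select?' . http_build_query($params);
}

// Sends the query and decodes the JSON response into an array.
// Uses file_get_contents(), so allow_url_fopen must be enabled.
function solr_search($base_url, $query, $rows = 10)
{
    $json = file_get_contents(solr_select_url($base_url, $query, $rows));
    if ($json === false) {
        return null; // Solr was unreachable or returned an error status
    }
    return json_decode($json, true);
}
```

With a running instance, something like `$result = solr_search('http://localhost:8983/solr', 'title:php');` should return the matching documents under `$result['response']['docs']`.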

Faster, PHP! Kill! Kill!

Posted on 21st March 2010 by Nio in Programming Life

What most developers don’t realize is that there are three major factors that typically slow down PHP projects based on frameworks (like Symfony or, sigh, Drupal) so much that code profiling and database query redesign don’t even have a chance to become relevant factors. Fix these things first before you worry about other issues:

1. Compiling code over and over and over. Would you wait for your Mac to recompile MacOS X from source code every time you boot it up? Of course not. How about every time you fill out a dialog box? That’s pretty much what you’re doing every time you access a PHP-driven website that doesn’t use a bytecode cache.

2. Waiting and waiting and waiting for web browsers to make another request, pinning down web server processes that your other users need. By default Apache usually lets browsers hold on to a connection for up to 15 seconds just in case they ask for more. This is a good thing in many ways, but 15 seconds is far too long. Which leads us to #3:

3. Tying up a “fat” web server process with PHP on board for every request, even requests for the zillions of little static PNGs that probably make up your page design. (**) A typical Apache web server configuration with mod_php suffers from this flaw, fatally limiting the number of simultaneous users you can handle.

So what can we do about these problems? Quite a bit as it turns out. I’ll start with the low-hanging fruit and move on to the tougher stuff. The fascinating common thread with all of these suggestions: no changes at all to your PHP code.
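Before tackling the first point, it helps to know whether a bytecode cache is loaded at all. The sketch below is a hedged example: `loaded_bytecode_caches()` is a hypothetical helper name, and the three extension names it checks are examples chosen for this sketch rather than an exhaustive list. A loaded extension may still be disabled in `php.ini`, so treat the result as a first hint only.

```php
<?php
// Returns the names of known bytecode cache extensions loaded in this PHP build.
// loaded_bytecode_caches() is a hypothetical helper; the candidate list is an
// assumption and only covers a few well-known caches.
function loaded_bytecode_caches()
{
    $candidates = array('Zend OPcache', 'apc', 'wincache');
    // extension_loaded() reports whether the extension is loaded,
    // not whether caching is actually switched on.
    return array_values(array_filter($candidates, 'extension_loaded'));
}

$caches = loaded_bytecode_caches();
if ($caches) {
    echo 'Bytecode cache loaded: ' . implode(', ', $caches) . "\n";
} else {
    echo "No bytecode cache detected\n";
}
```

Run it through the same SAPI that serves your site (for example as a page under Apache), since the command-line PHP often uses a different configuration.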

A cookie problem when fetching pages with PHP curl

Posted on 28th October 2009 by Nio in Busy at Work, Programming Life

When fetching pages with PHP curl, you can set a file in which cookies are saved. Example code:


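<?php
// File used to store cookies between requests
$cookie_path = 'cookie.txt';
$ch = curl_init();
// Read existing cookies from this file when sending requests
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_path);
// Write received cookies back to this file when the handle is closed
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_path);
//....
?>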

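Pay special attention to one thing: once the fetch is complete, delete the cookie file. Otherwise the next fetch will automatically reuse the existing cookie data, which can lead to unexpected errors (this problem cost us a lot of time today :( ).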
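One way to make sure the cookie file never survives a fetch is to give each run its own temporary file and delete it in a `finally` block. This is only a sketch under assumptions: `with_temp_cookie_file()` and `fetch_page()` are hypothetical helper names, not part of the curl extension, and `finally` needs PHP 5.5 or later.

```php
<?php
// Hypothetical helper (not part of the curl extension): gives $work a fresh
// cookie file and deletes that file afterwards, even if $work throws.
function with_temp_cookie_file(callable $work)
{
    $cookie_path = tempnam(sys_get_temp_dir(), 'cookie_');
    try {
        return $work($cookie_path);
    } finally {
        if (file_exists($cookie_path)) {
            unlink($cookie_path);
        }
    }
}

// Fetches one page with a throwaway cookie file. Requires the curl extension.
function fetch_page($url)
{
    return with_temp_cookie_file(function ($cookie_path) use ($url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_path);
        curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_path);
        $html = curl_exec($ch); // page body as a string, or false on failure
        curl_close($ch);
        return $html;
    });
}
```

A call such as `$html = fetch_page('http://example.com/');` then starts every fetch without cookies left over from a previous run. If several requests need to share a session, make them all inside the same callback before it returns.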

WinCache – Preliminary tests look REALLY good

Posted on 7th September 2009 by Nio in Cache, Programming Life

Those of you who follow me on Twitter know that I recently tweeted that I had installed Microsoft's new PHP opcode cache, WinCache, on a test machine and didn't see much difference in performance. I later tweeted that this was probably due to my inexperience in managing IIS7 and not necessarily a failing of WinCache. In between those two posts, I received two messages from people working with Microsoft, the most helpful being from Ruslan Yakushev. If you recognize that name, it's because he writes a lot of good stuff over at iis.net, including the getting started guide for WinCache.

Ruslan picked up on the tweet and wrote me a very nice "How can I help" email. It started a conversation that eventually led me to the problem I was having, but I've only just now had a chance to finish my rudimentary testing. I can now say that yes, my configuration was wrong, and once I took Ruslan's advice, I saw a tremendous improvement.

Easyrest Rest Framework

Posted on 26th August 2009 by Nio in Programming Life

Easyrest Rest Framework 1.0 Released (Client and Server Library)

What is Easyrest?
Easyrest is a REST framework that contains client and server implementations. It offers easy structured data transfer, unlike XML-RPC. Easyrest uses a lot of PEAR packages and has custom API key functionality. I don't think relying on the PEAR libraries is a disadvantage, because you don't have to install them: Easyrest can work from its own custom PEAR directory without any PEAR installation.