Huge posts seem to kill rebuild searchindex


  • #16
    This sounds more like an issue with local setups than it does with vBulletin...

    Are you using the same Apache builds on your local server as on your live server? MySQL? PHP?
    Same amount of memory on both machines?
    Same httpd files, or as close to the same as possible?
    MySQL optimized equally well on both servers?
    PHP compiled the same?
    Etc...
    Translations provided by Google.

    Wayne Luke
    The Rabid Badger - a vBulletin Cloud demonstration site.
    vBulletin 5 API - Full / Mobile
    Vote for your favorite feature requests and the bugs you want to see fixed.



    • #17
      No, most of those will be different, and I don't have access to change them.

      But I've found posts that are much bigger than the posts that resulted in the crash. Maybe this is a problem with the parser?
      Hints & Tips:
      [[vB3] More Spiders / Indexers / Archives for vB3 - list]|[List of one-time-emails to ban]


      http://sfx-images.mozilla.org/affili...efox_80x15.png



      • #18
        If it was a problem with the parser surely it would happen consistently across platforms and servers.

        My localhost is an almost exact duplicate of my server. The only differences are the kernel version (2.4.18-27.7.x vs. 2.4.20-8) and that the server has 1 GB RAM while my local machine has 512 MB RAM.

        Wayne Luke


        • #19
          Originally posted by Wayne Luke
          If it was a problem with the parser surely it would happen consistently across platforms and servers.
          That could be a cause, but why? If I have a forum on a server and I switch ISPs, it should only be a matter of setting up the software on the new server, loading the backup .sql file into the database, and away you go. As long as the PHP and MySQL requirements are met, this should be server independent.

          I've had about 40 posts fail out of 51K indexed so far. I plan to watch the forums here until next Saturday. I can do my backups and try it on my live server and switch back if indexing fails there. If they don't index there, then I'll agree. I would like to determine the failure mechanism.

          When I finish indexing on my testbed, I'll look at the posts on the live board and see if there's anything that stands out. I am quite sure it doesn't relate to post length.



          • #20
            It would be a lot easier to determine the failure mechanism if people gave some basic information about the problem...

            For instance, we should have the OS, Apache, MySQL and PHP versions, installed memory, and other relevant information about your test and live servers to see if there is any correlation. We should also have access to the troublesome posts in raw format so we can test them ourselves.

            Telling us it fails doesn't help us solve anything. We need to know exactly what is failing, when, and why. Giving the above information will help. For all we know it could be a problem with the binary distribution of Apache under Windows. Or the fact that you are using MySQL-Max under Windows, because it requires more memory. Maybe you compiled PHP with a different optimization flag that is causing the problem. Maybe PHPA or Zend Optimizer is set up differently on the two computers. Maybe it is vBulletin, but until we can draw a correlation everything is pure speculation, and as long as we don't get information, speculation is all we have to go on.
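The details requested above could be collected with a short script run on both the test and the live server. This is only a sketch and not part of vBulletin: the function name `collect_environment` is made up for illustration, it assumes a PHP release much newer than the 4.x builds discussed in this thread (return types, `??`, `PCRE_VERSION`), and it leaves out the MySQL version because that needs a database connection.

```php
<?php
// Hypothetical helper (not part of vBulletin): gather the server details
// asked for in this thread so two machines can be compared side by side.
function collect_environment(): array
{
    return [
        'os'              => php_uname(),
        'php_version'     => PHP_VERSION,
        'sapi'            => php_sapi_name(),
        // Only set when running behind a web server; absent on the CLI.
        'web_server'      => $_SERVER['SERVER_SOFTWARE'] ?? 'n/a',
        'memory_limit'    => ini_get('memory_limit'),
        // Assumption: PCRE_VERSION is defined (true for newer PHP releases).
        'pcre_version'    => defined('PCRE_VERSION') ? PCRE_VERSION : 'unknown',
        // Zend extensions only (optimizers, accelerators, debuggers).
        'zend_extensions' => get_loaded_extensions(true),
    ];
}

foreach (collect_environment() as $key => $value) {
    echo $key, ': ', is_array($value) ? implode(', ', $value) : $value, "\n";
}
```

Running it on each machine and diffing the two outputs gives a quick first pass at spotting a correlation.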

            Wayne Luke


            • #21
              I've managed to reproduce the crash by extracting the necessary functions into a test script, and I could track down the cause: it's caused by the regexes that replace the url tags in the text:
              PHP Code:
              // simple links
              $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
              $replace[] = '\3';

              // named links
              $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]([^\]^\[]+)\[/\\1\]#si';
              $replace[] = '\4 (\3)';

              // replace links (and quotes if specified) from message
              $message = preg_replace($find, $replace, $message);
              btw: My WAMP-Specs are:
              Win2k SP3
              Apache 2.0.45
              PHP 4.3.2

              with minor changes to the php.ini-recommended and httpd.conf

              Shall I email the script to you or send it through IRC @#vBorg? It contains one of the posts causing the problems.
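The test script itself is not included in the thread, so the following is only a hypothetical reconstruction of what such a standalone script might look like. Only the two patterns come from the post; the function name `strip_link_tags_original` and the sample messages are assumptions. It is written for a current PHP release, where a regex engine failure is likely to show up as a `null` return value rather than as the process crash reported here on PHP 4.3.2.

```php
<?php
// Hypothetical reconstruction of a standalone test script; the function name
// and sample data are illustrative, only the two patterns come from the post.
function strip_link_tags_original(string $message): ?string
{
    $find = [];
    $replace = [];

    // simple links
    $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
    $replace[] = '\3';

    // named links
    $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]([^\]^\[]+)\[/\\1\]#si';
    $replace[] = '\4 (\3)';

    // preg_replace() returns null when the regex engine reports an error.
    return preg_replace($find, $replace, $message);
}

// A short post goes through cleanly.
$short = 'See [url=http://example.com/page]Example[/url] for details';
echo strip_link_tags_original($short), "\n";
// prints: See Example (http://example.com/page) for details

// A synthetic "huge" post: one link followed by a long run of filler text.
$huge = '[url=http://example.com/page]Example[/url] ' . str_repeat('lorem ipsum ', 5000);
$result = strip_link_tags_original($huge);
if ($result === null) {
    echo 'preg_replace failed, preg_last_error() = ', preg_last_error(), "\n";
} else {
    echo 'huge post processed, ', strlen($result), " bytes\n";
}
```

Whether the large input fails depends on the PHP and PCRE build and its limits, so the script reports the outcome instead of assuming one.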


              • #22
                You can email it to me and I will forward it to one of the developers.

                wayne_at_vbulletin.com

                Wayne Luke


                • #23
                  I'll be getting all that together. I can say that I can download my database backup to my Mac and 'run' my 2.3.0 forums without incident including re-indexing. I'll get the Apache versions and such - I've got a spreadsheet going with details. It will be a day before I get it finished.

                  At present, I'm assuming it is not a parser or other vB issue or there would be a lot more posts in this thread. Nor is it a show stopper for me. More of an irritant.

                  I do know for me it's not a memory or disk space issue.



                  • #24
                    OK, done. Check your email, Wayne.


                    • #25
                      Originally posted by Stadler
                      It's caused by the regexes that replace the url tags in the text.
                      I believe that may be a commonality between us. As I remember when I looked at some posts yesterday each had at least one complex url.

                      But one other thing I noticed in my case: I cannot edit the post, delete the post, or delete the thread.
                      Last edited by Marc Smith; Mon 16th Jun '03, 6:50am.



                      • #26
                        It's the conditional that causes the crash.

                        Quick and dirty fix:

                        Warning: It is recommended that you don't use this on a live server unless a dev has confirmed it. And I mean a dev, not another user.

                        IN includes/functions.php FIND
                        PHP Code:
                        // simple links
                        $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
                        $replace[] = '\3';

                        // named links
                        $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]([^\]^\[]+)\[/\\1\]#si';
                        $replace[] = '\4 (\3)';

                        // replace links (and quotes if specified) from message
                        $message = preg_replace($find, $replace, $message);
                        REPLACE IT WITH
                        PHP Code:
                        $find = array(
                            '#\[(email|url)\]"([^"]+)"\[/\\1\]#i', // simple links surrounded by quotes
                            '#\[(email|url)\]([^\]\[]+)\[/\\1\]#i', // simple links not surrounded by quotes
                            '#\[(email|url)="([^"]+)"\]([^\]^\[]+)\[/\\1\]#i', // named links surrounded by quotes
                            '#\[(email|url)=([^\]\[]+)\]([^\]^\[]+)\[/\\1\]#i' // named links not surrounded by quotes
                        );

                        $replace = array(
                            '\2', // simple links surrounded by quotes
                            '\2', // simple links not surrounded by quotes
                            '\3 (\2)', // named links surrounded by quotes
                            '\3 (\2)' // named links not surrounded by quotes
                        );

                        // replace links (and quotes if specified) from message
                        $message = preg_replace($find, $replace, $message);
                        And I think I've fixed another bug with this, since
                        PHP Code:
                        // simple links
                        $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
                        $replace[] = '\3';
                        should be
                        PHP Code:
                        // simple links
                        $find[] = '#\[(email|url)\]("?)((?(2)[^"]|[^\]\[])+)\\2\[/\\1\]#i';
                        $replace[] = '\3';
                        btw: The s-switch (PCRE_DOTALL): Is that really needed? There are no dots in the regex.
                        Last edited by Stadler; Mon 16th Jun '03, 11:53am. Reason: s-switches removed From replace-codes
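On the s-switch question: PCRE_DOTALL only changes what an unescaped `.` matches, while a negated character class such as `[^"]` matches newlines with or without it. A small check, written as a sketch with made-up sample text:

```php
<?php
// Illustrative sample text containing a line break.
$text = "first line\nsecond line";

// A dot does not match "\n" unless the s modifier is set.
$dotPlain  = preg_match('#first.+second#', $text);   // 0
$dotDotall = preg_match('#first.+second#s', $text);  // 1

// A negated character class matches "\n" with or without the s modifier.
$classPlain  = preg_match('#first[^"]+second#', $text);   // 1
$classDotall = preg_match('#first[^"]+second#s', $text);  // 1

var_dump($dotPlain, $dotDotall, $classPlain, $classDotall);
```

Since the patterns in this thread contain no dots, dropping the modifier should not change what they match.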


                        • #27
                          Originally posted by Stadler
                          It's the conditional that causes the crash.
                          *grumbles* Won't even let me do cool regex things!

                          Wayne emailed me the script so I'll see what I can do.



                          • #28
                            So far it appears to be the answer to my indexing problem. The couple of posts I tried it on indexed normally.



                            • #29
                              Anyway: It's not recommended to try this in a live environment, since the regex could contain errors, though I've done some debugging. I haven't applied it to our forums either. I'm waiting until a dev confirms it before I apply it online.
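One way to gain some confidence before a dev confirms the fix is to run the replacement patterns from post #26 against a few hand-written samples on a test machine. The sketch below does that. The wrapper name `strip_link_tags_fixed` and the sample inputs are assumptions made for illustration, and passing these samples does not show the patterns are free of errors.

```php
<?php
// Replacement patterns as posted in #26; the wrapper function name and the
// sample inputs below are illustrative assumptions.
function strip_link_tags_fixed(string $message): ?string
{
    $find = array(
        '#\[(email|url)\]"([^"]+)"\[/\\1\]#i',             // simple links surrounded by quotes
        '#\[(email|url)\]([^\]\[]+)\[/\\1\]#i',            // simple links not surrounded by quotes
        '#\[(email|url)="([^"]+)"\]([^\]^\[]+)\[/\\1\]#i', // named links surrounded by quotes
        '#\[(email|url)=([^\]\[]+)\]([^\]^\[]+)\[/\\1\]#i' // named links not surrounded by quotes
    );

    $replace = array(
        '\2',      // simple links surrounded by quotes
        '\2',      // simple links not surrounded by quotes
        '\3 (\2)', // named links surrounded by quotes
        '\3 (\2)'  // named links not surrounded by quotes
    );

    return preg_replace($find, $replace, $message);
}

// Input => expected output for each of the four tag forms, plus two controls.
$samples = array(
    '[url]http://example.com/[/url]'             => 'http://example.com/',
    '[url]"http://example.com/"[/url]'           => 'http://example.com/',
    '[url=http://example.com/]Example[/url]'     => 'Example (http://example.com/)',
    '[url="http://example.com/"]Example[/url]'   => 'Example (http://example.com/)',
    '[email=someone@example.com]Mail me[/email]' => 'Mail me (someone@example.com)',
    'no tags here'                               => 'no tags here',
);

foreach ($samples as $input => $expected) {
    $actual = strip_link_tags_fixed($input);
    echo ($actual === $expected ? 'ok   ' : 'FAIL ') . $input . ' => ' . var_export($actual, true) . "\n";
}
```

Raw copies of the posts that failed to index would make better samples than these synthetic ones.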


                              • #30
                                Originally posted by Ed Sullivan
                                *grumbles* Won't even let me do cool regex things!

                                Wayne emailed me the script so I'll see what I can do.
                                Did you report the bug @ php.net?