Huge posts seem to kill rebuild searchindex

  • Stadler
    replied
    aah, right. I forgot that. Well, I've never used it for the ? quantifier.


  • Mike Sullivan
    replied
    Originally posted by Stadler
    btw: why ("??) and not ("?)? Typo? Alias for {0,2}?
    The code is correct. With the U modifier, quantifiers are ungreedy by default, so the extra ? makes the "? greedy. In this case, that means it prefers to match a tag with a " if it can. Otherwise you have problems with stripping URLs like .../file.php?arg[arr]=val, even when your URL tags have quotes around them.

    These changes will now appear in the next release.
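
The difference between `"?` and `"??` under the `U` modifier can be checked with a short script. This is only a minimal sketch, not vBulletin code: the helper name `strip_named_links` and the sample message are made up for illustration, and the `("?)` variant is the hypothetical alternative asked about in this thread, not something that shipped.

```php
<?php
// Hypothetical helper for illustration; not part of vBulletin.
function strip_named_links($pattern, $message)
{
    return preg_replace($pattern, '\4 (\3)', $message);
}

// Made-up sample: a quoted URL that itself contains square brackets.
$message = '[url="http://example.com/file.php?arg[arr]=val"]link[/url]';

// Under U, a single ? quantifier is lazy, so the optional quote is skipped first.
$lazyQuote = '#\[(email|url)=("?)(.+)\\2\](.+)\[/\\1\]#siU';
// Under U, ?? flips back to greedy, so the quote is consumed when present.
$greedyQuote = '#\[(email|url)=("??)(.+)\\2\](.+)\[/\\1\]#siU';

echo strip_named_links($greedyQuote, $message), "\n";
// link (http://example.com/file.php?arg[arr]=val)

echo strip_named_links($lazyQuote, $message), "\n";
// Here the URL is cut at the first ] because the opening quote was not captured.
```

With the greedy variant the opening quote lands in group 2, so the backreference `\2` forces the URL to run up to the closing quote rather than stopping at the first `]`.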


  • Stadler
    replied
    It works both in the test script and on my local test board, and it seems to alter the links as intended.

    btw: why ("??) and not ("?)? Typo? Alias for {0,2}?


  • Mike Sullivan
    replied
    Can those of you who are having problems try these replacement regexes?

    In includes/functions.php, find:

    Code:
      // simple links
      $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
      $replace[] = '\3';
      // named links
      $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]([^\]^\[]+)\[/\\1\]#si';
      $replace[] = '\4 (\3)';
    Replace these with:

    Code:
    // simple links
    $find[] = '#\[(email|url)=("??)(.+)\\2\]\\3\[/\\1\]#siU';
    $replace[] = '\3';
    // named links
    $find[] = '#\[(email|url)=("??)(.+)\\2\](.+)\[/\\1\]#siU';
    $replace[] = '\4 (\3)';


  • Mike Sullivan
    replied
    I haven't had a chance to test it yet -- it might not be a PHP/PCRE bug, but just that it turns into a very slow regex in some cases.
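
On newer PHP versions, one way to tell a PCRE failure apart from a regex that is merely slow is to check the return value of `preg_replace()`. The following is a sketch under assumptions: it needs PHP 5.2 or later for `preg_last_error()` (newer than the PHP 4.3.2 mentioned in this thread), and `checked_preg_replace` is a hypothetical helper name, not a vBulletin function.

```php
<?php
// Hypothetical helper, not part of vBulletin: wraps preg_replace() and
// turns a PCRE failure into an exception instead of a silent null.
function checked_preg_replace($find, $replace, $message)
{
    $result = preg_replace($find, $replace, $message);
    if ($result === null) {
        // preg_last_error() returns one of the PREG_*_ERROR constants,
        // for example PREG_BACKTRACK_LIMIT_ERROR.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $result;
}

// Made-up usage: a well-behaved pattern goes through unchanged.
echo checked_preg_replace('#\[b\](.+)\[/b\]#siU', '\1', '[b]bold[/b]'), "\n";
```

A crash of the PHP process itself, as reported in this thread, would not be caught this way; the check only helps when PCRE gives up and reports an error.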


  • Stadler
    replied
    Originally posted by Mike Sullivan
    *grumbles* Won't even let me do cool regex things!

    Wayne emailed me the script so I'll see what I can do.
    Did you report the bug @ php.net?


  • Stadler
    replied
    Anyway: I don't recommend trying this in a live environment, since the regex could contain errors, though I've done some debugging. I haven't applied it to our forums either. I'm waiting until a dev confirms it before I apply it online.


  • Marc Smith
    replied
    So far it appears to be the answer to my indexing problem. The couple of posts I tried it on indexed normally.


  • Mike Sullivan
    replied
    Originally posted by Stadler
    It's the conditional, that causes the crash.
    *grumbles* Won't even let me do cool regex things!

    Wayne emailed me the script so I'll see what I can do.


  • Stadler
    replied
    It's the conditional, that causes the crash.

    Quick and dirty fix:

    Warning: I recommend that you don't use this on a live server unless a dev has confirmed it. And I mean a dev, not another user.

    IN includes/functions.php FIND
    PHP Code:
    // simple links
    $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
    $replace[] = '\3';

    // named links
    $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]([^\]^\[]+)\[/\\1\]#si';
    $replace[] = '\4 (\3)';

    // replace links (and quotes if specified) from message
    $message = preg_replace($find, $replace, $message);
    REPLACE IT WITH
    PHP Code:
    $find = array(
        '#\[(email|url)\]"([^"]+)"\[/\\1\]#i', // simple links surrounded by quotes
        '#\[(email|url)\]([^\]\[]+)\[/\\1\]#i', // simple links not surrounded by quotes
        '#\[(email|url)="([^"]+)"\]([^\]^\[]+)\[/\\1\]#i', // named links surrounded by quotes
        '#\[(email|url)=([^\]\[]+)\]([^\]^\[]+)\[/\\1\]#i' // named links not surrounded by quotes
    );

    $replace = array(
        '\2', // simple links surrounded by quotes
        '\2', // simple links not surrounded by quotes
        '\3 (\2)', // named links surrounded by quotes
        '\3 (\2)' // named links not surrounded by quotes
    );

    // replace links (and quotes if specified) from message
    $message = preg_replace($find, $replace, $message);
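
To see what the four conditional-free patterns above do to a post, they can be run in a standalone script. This is a rough sketch, not a tested vBulletin patch: the function name `strip_links_without_conditional` and the sample message are invented for illustration, and only the patterns and replacements come from this post.

```php
<?php
// Hypothetical wrapper for illustration; not part of vBulletin.
function strip_links_without_conditional($message)
{
    $find = array(
        '#\[(email|url)\]"([^"]+)"\[/\\1\]#i',             // simple links surrounded by quotes
        '#\[(email|url)\]([^\]\[]+)\[/\\1\]#i',            // simple links not surrounded by quotes
        '#\[(email|url)="([^"]+)"\]([^\]^\[]+)\[/\\1\]#i', // named links surrounded by quotes
        '#\[(email|url)=([^\]\[]+)\]([^\]^\[]+)\[/\\1\]#i' // named links not surrounded by quotes
    );
    $replace = array('\2', '\2', '\3 (\2)', '\3 (\2)');

    // preg_replace() applies each pattern in turn to the running result.
    return preg_replace($find, $replace, $message);
}

// Made-up sample post with one simple link and one quoted named link.
$message = 'See [url]http://example.com/[/url] and '
         . '[url="http://example.com/file.php?arg[arr]=val"]the docs[/url].';

echo strip_links_without_conditional($message), "\n";
// See http://example.com/ and the docs (http://example.com/file.php?arg[arr]=val).
```

Splitting the quoted and unquoted cases into separate patterns means each one relies only on plain negated character classes, with no conditional subpattern inside a repeated group.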
    And I think I've fixed another bug with this, since
    PHP Code:
    // simple links
    $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
    $replace[] = '\3';
    should be
    PHP Code:
    // simple links
    $find = '#\[(email|url)\]("?)((?(2)[^"]|[^\]\[])+)\\2\[/\\1\]#i';
    $replace = '\3';
    btw: The s-switch (PCRE_DOTALL): Is that really needed? There are no dots in the regex.
    Last edited by Stadler; Mon 16th Jun '03, 11:53am. Reason: s-switches removed From replace-codes
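
On the s-switch question in the post above: the `s` modifier (PCRE_DOTALL) only changes what `.` matches, while a negated character class such as `[^"]` matches newlines with or without it. A small sketch with made-up sample strings shows the difference:

```php
<?php
// Made-up sample text containing a newline.
$text = "a\nb";

// Without s, the dot does not match a newline.
$dotWithoutS = preg_match('#a.b#', $text);

// With s, the dot matches any character, including a newline.
$dotWithS = preg_match('#a.b#s', $text);

// A negated character class matches the newline even without s.
$classWithoutS = preg_match('#\[url="([^"]+)"\]#', "[url=\"line1\nline2\"]", $m);

var_dump($dotWithoutS, $dotWithS, $classWithoutS);
// int(0)
// int(1)
// int(1)
```

So for the patterns built only from character classes the switch makes no difference, whereas the replacement patterns posted later in the thread use `.+`, where it does.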


  • Marc Smith
    replied
    Originally posted by Stadler
    It's been caused by the regexes to replace the url-tags in the text.
    I believe that may be a commonality between us. As I remember, when I looked at some posts yesterday, each had at least one complex URL.

    But one other thing I noticed in my case: I cannot edit the post, delete the post, or delete the thread.
    Last edited by Marc Smith; Mon 16th Jun '03, 6:50am.


  • Stadler
    replied
    OK, done. Check your email, Wayne.


  • Marc Smith
    replied
    I'll be getting all that together. I can say that I can download my database backup to my Mac and 'run' my 2.3.0 forums without incident including re-indexing. I'll get the Apache versions and such - I've got a spreadsheet going with details. It will be a day before I get it finished.

    At present, I'm assuming it is not a parser or other vB issue or there would be a lot more posts in this thread. Nor is it a show stopper for me. More of an irritant.

    I do know for me it's not a memory or disk space issue.


  • Wayne Luke
    replied
    You can email it to me and I will forward it to one of the developers.

    wayne_at_vbulletin.com


  • Stadler
    replied
    I've managed to reproduce the crash by extracting the necessary functions into a test script, and I could hunt down the cause of it: it's caused by the regexes that replace the url-tags in the text:
    PHP Code:
    // simple links
    $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]\\3\[/\\1\]#si';
    $replace[] = '\3';

    // named links
    $find[] = '#\[(email|url)=("?)((?(2)[^"]|[^\]\[])+)\\2\]([^\]^\[]+)\[/\\1\]#si';
    $replace[] = '\4 (\3)';

    // replace links (and quotes if specified) from message
    $message = preg_replace($find, $replace, $message);
    btw: My WAMP-Specs are:
    Win2k SP3
    Apache 2.0.45
    PHP 4.3.2

    with minor changes to the php.ini-recommended and httpd.conf

    Shall I email the script to you or send it through IRC @#vBorg? It contains one of the posts causing the problems.
