How to eliminate stray html code after double forum conversion?


  • How to eliminate stray html code after double forum conversion?

    Hello,

    I bought vBulletin in late July as part of a massive upgrade program for an EZboard forum I was running. Since I could find no vBulletin script to convert the EZboard posts and user names (their databases are unavailable to forum administrators), I used a phpBB script that actually parses the html code of the EZboard display pages and creates tables and data for a phpBB2 forum based on that info. From that intermediate forum, I was then able to run the official phpbb2 to vBulletin scripts that would convert the intermediate forum to vBulletin.

    The phpBB parsing script worked well in terms of getting the bulk of the important data over in a reasonable format (forum, thread, and post structures came over intact). But it left some stray html/EZboard code in the text of every post. In many posts, it is merely a "</p>" tag at the end. In many others, there's lots of stray code from God knows what . . . links, quotes, emoticons, "edited by" addendums, etc. Here's a typical example.

    Is there a MySQL query or vBulletin script that I can run to get rid of this stuff without harming the actual content of the posts?
    Visit The Chase Lounge for some of the most insightful Sopranos discussions on the net!

  • #2
    You may be able to use the cleaner.php script in Impex to fix this:

    http://www.vbulletin.com/docs/html/main/impex_cleaner

    Note: We do provide a free EZBoard import service.
    Steve Machol, former vBulletin Customer Support Manager (and NOT retired!)
    Change CKEditor Colors to Match Style (for 4.1.4 and above)

    Steve Machol Photography


    Mankind is the only creature smart enough to know its own history, and dumb enough to ignore it.




    • #3
      Originally posted by Steve Machol
      You may be able to use the cleaner.php script in Impex to fix this:

      http://www.vbulletin.com/docs/html/main/impex_cleaner

      Note: We do provide a free EZBoard import service.

      Thanks for the tip. I will try the cleaner and report back.

      But I looked EVERYWHERE (so I thought) for a vBulletin converter for EZboard and could find none. (This was in July.) Where is it listed?

      Also, just curious . . . how could you import a forum -- which I interpret to mean converting/importing a database -- when an EZboard database is not available to admins? Is the vBulletin import essentially the same as the phpBB, an html parser that converts the posts from the display page? Does it provide the ability to import email addresses and passwords from the EZboard member tables?


      • #4
        You can request it here:

        http://www.vbulletin.com/members/import.php

        Click on the 'Request ezboard import' link near the bottom of the page.

        My understanding is that the importer extracts the posts and threads from the URLs. This will import all of your posts, but not your member info, since EZBoard does not provide access to that data.


        • #5
          Okay, I downloaded the latest ImpEx, modified the config file with my sql server info (even though I'm not importing a board -- that was done months ago), uploaded everything via ftp, and confirmed that Import was showing up in the admin cp.

          Since there is apparently no way to run the cleaner.php script from the cp, I entered the http path to the script itself in my browser (Firefox). I get nothing but a blank screen with the words "Not active" at the top. My board is, of course, turned off for the moment.

          What gives?

          ETA: I also checked CHMOD settings for the script file and the folders in its directory. All folders are 755 and the script itself is 644. I tried altering the CHMOD on the script to 755 and that did not help, so I put it back.


          • #6
            I'm moving this to the Import forum since that's an Impex script and Jerry should be the one to respond.


            • #7
              You need to edit the cleaner.php file to configure it with what you want it to clean out.

              I added a setting to the file:

              PHP Code:
              $active = false;
              That way the file has to be edited, and the instructions read, before it will work (i.e. that has to be set to true). Also remove it and the rest of ImpEx when you are done.

              You only need ImpExConfig.php and cleaner.php to run cleaner.
              I wrote ImpEx.

              Blog | Me



              • #8
                Jerry,

                Just now finding the time to follow up on this and actually open the cleaner file for editing. I have a couple of questions, since this looks quite complex for a novice like myself:

                (1) I have many hundreds of posts with stray html. Replacing some <blockquote> tags with appropriate BB code tags is an easy and short enough process since the tags themselves don't vary from instance to instance. But what about anchor reference tags where you've got

                <A HREF="http://CouldBeAnythingHere.com">Visible Link Text</A>?

                I can't possibly go through and make a separate instruction for every variation of those tags. Is there a way to use an "*" or other wildcard in the array so that I can just get rid of the unsightly html tags and leave the Visible Link Text? While I'd like to still have the Visible Text serve as an active link, I'm not wed to that idea and am most concerned with getting rid of the tags.

                (2) If I set the script at its default, to ONLY search within and modify post text, and if I carefully configure the array in accordance with the instructions, is there any danger that this script will screw up any text or code OTHER than that which is defined on the left side of the array?
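
For illustration only, the kind of wildcard match asked about in (1) is usually done with a regular expression rather than a literal find/replace pair. The sketch below is not part of cleaner.php or ImpEx; the function names `anchors_to_bbcode` and `strip_anchor_tags` are hypothetical, and it assumes the stray anchors look like the `<A HREF="...">` sample above. Anything like this should be tried on a copy of the post text first.

```php
<?php
// Hypothetical helpers, not part of ImpEx or cleaner.php.

// Convert <a href="URL">label</a> into [url=URL]label[/url].
function anchors_to_bbcode(string $text): string
{
    // Case-insensitive (i), and "." also matches newlines (s).
    // Group 1 = the quote character, group 2 = the URL, group 3 = the visible text.
    $pattern = '~<a\b[^>]*\bhref\s*=\s*(["\'])(.*?)\1[^>]*>(.*?)</a>~is';
    // Fall back to the untouched text if the regex engine reports an error.
    return preg_replace($pattern, '[url=$2]$3[/url]', $text) ?? $text;
}

// Alternative: drop the opening and closing anchor tags, keep only the visible text.
function strip_anchor_tags(string $text): string
{
    return preg_replace('~</?a\b[^>]*>~i', '', $text) ?? $text;
}

$sample = 'See <A HREF="http://CouldBeAnythingHere.com">Visible Link Text</A> for details.';
echo anchors_to_bbcode($sample), "\n";
echo strip_anchor_tags($sample), "\n";
```

The first function keeps the text as an active link in BB code form; the second matches the "just get rid of the tags" fallback. Neither handles anchors with unquoted href values, which is one reason to test against real posts before touching the database.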


                • #9
                  1) All href tags should be converted by the standard ImpEx parsing of content.

                  2) The script will match anything it can on the left side of the array and replace it with the right, hence the warnings in the file and the constant reminders to back up your database.
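
To make point 2 concrete, here is a minimal sketch of how a literal find/replace array behaves in PHP. It is an assumption about the general technique, not the actual cleaner.php code; the `$replacements` map and the `clean_post` function are hypothetical names chosen for the example.

```php
<?php
// Hypothetical map: literal text to find (left) => text to put in its place (right).
$replacements = [
    '<blockquote>'  => '[quote]',
    '</blockquote>' => '[/quote]',
    '</p>'          => '',
];

// Apply every pair to one post body; return the cleaned text and the number of hits.
function clean_post(string $postText, array $replacements): array
{
    $count = 0;
    $cleaned = str_replace(
        array_keys($replacements),
        array_values($replacements),
        $postText,
        $count
    );
    return [$cleaned, $count];
}

$post = '<blockquote>Old quote</blockquote>Reply text</p>';
[$cleaned, $hits] = clean_post($post, $replacements);
echo $cleaned, "\n";
echo $hits, " replacement(s)\n";
```

Every occurrence of a left-side string is replaced wherever it appears, including inside text a member typed on purpose, and `str_replace` works through the pairs in order, so a later pair can act on text produced by an earlier one. Counting hits on a copy of the data before running anything for real is a cheap way to spot a pattern that matches more than expected.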