Mediawiki notes

These are primarily notes
It won't be complete in any sense.
It exists to contain fragments of useful information.

(See also Harvesting wikipedia)

Mediawiki is the software used in wikipedia, and in this wiki.


Setting up mediawiki

Roughly:

  • Install database
  • Set up apache
  • Unpack the mediawiki archive
  • Configure
  • Configure some more


Installing MySQL

There is postgres support now, and while I personally consider that a more mature database engine, I'm not sure how mature mediawiki's interface to it is.

There's probably a package manager installation you can use to install mysql, though there's always some configuration involved. There are plenty of FAQs out there.


Setting up apache

Nothing special. In my case, I run various hosts on the same server, so I set these things in vhosts. For example: the entire entry for this wiki used to be:

<VirtualHost 145.99.239.174>
  ServerName helpful.knobs-dials.com
  DocumentRoot /var/www/helpful
  ServerAdmin  me@example.com
  <Directory  /var/www/helpful>
     Options            None
     AllowOverride      None
     Order allow,deny
     Allow from All
  </Directory>
</VirtualHost>

When edited to your wishes, make apache reload its configuration. In my case:

/etc/init.d/apache2 reload

Unpacking the archive

You wouldn't be here if your distro had mediawiki in a package, right?

Go to the apache directory you want the wiki in, /var/www/helpful in the above case. (I usually do without the per-site division between icons and htdocs that apache uses).

Then unpack with something like:

tar xf mediawiki-1.3.5.tgz

You will probably have to change ownership on the files to be safe, usually something like the following will do:

chown apache:apache /var/www/helpful -R      # ...or whatever the applicable user is called on your system

Configuration: Wiki and database setup

You should chmod a+w the config directory:

chmod a+w config/

...so that the config script you're about to run can create a new configuration file.


When you surf to the URL you just installed on, it will tell you the wiki is not configured, and prompt you to do so.

As to the database part:

  • it's easiest if you have the mysql administrative login, as this will enable the setup to create a unique database for the wiki.
  • If your sysadmin gave you a database and login for the wiki, specify its details.
    • If the sysadmin only gives one database to each of its users, you may wish to use mediawiki's option to prefix the tables, to avoid name collisions.


After you submit this page and everything goes well, it will tell you that LocalSettings.php currently in the config/ subdirectory needs to be moved to the main directory.

At this point your wiki is usable.

To be safe, you should chmod go-w LocalSettings.php, otherwise people may be able to edit the settings file.
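
As a rough sketch of those last two steps (moving the generated file into place and tightening its permissions), assuming a POSIX shell. The function name finish_install and its argument are made up for illustration, and taking write access away from config/ again is my own addition rather than something the installer asks for:

```shell
# finish_install WIKI_ROOT
# Hypothetical helper: moves the generated LocalSettings.php into the main
# directory and removes group/other write permission from it.
finish_install() {
    wiki_root="$1"
    mv "$wiki_root/config/LocalSettings.php" "$wiki_root/LocalSettings.php" || return 1
    chmod go-w "$wiki_root/LocalSettings.php" || return 1
    # Assumption: config/ no longer needs to be writable by everyone after setup.
    chmod go-w "$wiki_root/config"
}

# Example call, using the path from the vhost above:
# finish_install /var/www/helpful
```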

Some minor setup trouble

"The upload directory (public) is not writable by the webserver."

Means the images/ directory isn't writable by the webserver. (I'm not sure what 'public' refers to. There is no directory with that name in the mediawiki tree.)

Often you can chown/chgrp to change ownership (if needed), followed by a

chmod -R g+w images

if you changed the group to the webserver group.
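
Put together, a minimal sketch of that ownership and permission change. The function name and the www-data group in the example call are assumptions; use whatever group your webserver actually runs as.

```shell
# fix_upload_perms WIKI_ROOT WEB_GROUP
# Hypothetical helper: hands images/ to the webserver's group and
# makes it group-writable.
fix_upload_perms() {
    wiki_root="$1"
    web_group="$2"
    chgrp -R "$web_group" "$wiki_root/images" || return 1
    chmod -R g+w "$wiki_root/images"
}

# Example call (group name is an assumption):
# fix_upload_perms /var/www/helpful www-data
```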


Mediawiki and PHP safe mode

imagemagick or not; "Error Creating Thumbnail: Unable to run external programs in Safe Mode"

Assuming that you can't avoid safe mode (e.g. shared hosting with their usual slight paranoid tendencies), then you'll probably want to change LocalSettings.php to set:

$wgUseImageMagick = false;

This means it'll use PHP's image functions instead of imagemagick, which means fewer file formats are supported (but if PNG, GIF and JPG cover all your use then that's no problem).

See also http://www.mediawiki.org/wiki/Manual:$wgUseImageMagick


diffs; File(/usr/bin/diff3) is not within the allowed path(s)

Unsure; there may not be a way around this one.(verify)



Updating mediawiki

Mediawiki has a good resource: http://www.mediawiki.org/wiki/Manual:Upgrading


Summary:

  • You should back up the documentroot and the database it uses. I've done a handful of updates and never had it mess things up, but having a lot of data tied in some undefined half-updated state won't be fun.
  • If you have a vanilla install, and the only thing that is specific to your site is your LocalSettings.php, upgrading is very simple. If you have further customisation (extensions, altered style) then you'll need to do a little more work copying and/or updating them.
  • Most of the work: you can copy the newer version's directory structure over the older (overwrites many things, but not LocalSettings.php or uploaded files), run its maintenance/update.php from the shell to update the database, wait for it to finish, and be done.
    • update.php can update from fairly old versions. If you have something extremely old you may have to do it in multiple steps(verify), but typically you don't.
  • Database-wise: Depending on how you set up the database user's permissions, you may need to temporarily grant a little more while upgrading. For example, updating probably needs to CREATE things, which you may have restricted.
  • On a live site you may want to temporarily put up a maintenance message. (Consider this trick)
  • You typically want to check that things still work. Most things are tested if you browse around a few pages, edit one (perhaps preferably one that then uses an extension), upload a file.
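
The copy-and-update step from the summary above could be sketched roughly as follows. The paths, the variable names and the run/upgrade_wiki helpers are all assumptions for illustration, and it defaults to a dry run that only prints what it would do. Make your backups before running it for real.

```shell
# Hypothetical locations; override them via the environment.
WIKI_ROOT="${WIKI_ROOT:-/var/www/helpful}"   # the existing install
NEW_TREE="${NEW_TREE:-/tmp/mediawiki-new}"   # the unpacked newer release
DRY_RUN="${DRY_RUN:-1}"                      # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_wiki() {
    # In a real run, refuse to continue if this is not a configured wiki.
    [ "$DRY_RUN" = 1 ] || [ -f "$WIKI_ROOT/LocalSettings.php" ] || {
        echo "no LocalSettings.php in $WIKI_ROOT" >&2
        return 1
    }
    # Copy the newer tree over the older one. The release archive has no
    # LocalSettings.php or uploads of its own, so those are left alone.
    run cp -R "$NEW_TREE/." "$WIKI_ROOT/" || return 1
    # Let mediawiki bring the database up to date.
    run php "$WIKI_ROOT/maintenance/update.php"
}

# Preview first, then run for real:
# upgrade_wiki
# DRY_RUN=0 upgrade_wiki
```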


Spam protection

If your wiki is public, shows up in search engines, and is editable by everyone, it'll eventually get spammed.


Content

There are various solutions, and which is handiest depends on how you use the wiki (and on the other spam protection you use).

You can disable anonymous edits and force people to register, log in, but that will probably scare potential editors away, which really isn't the point.


Spam filters are probably nicer. Some options:

  • a CAPTCHA, often using ConfirmEdit, or reCAPTCHA.
    • note most CAPTCHAs have been broken to some degree
    • can be made to trigger only under specific conditions, e.g.
      • to never bug admin edits
      • never bother logged-in users - which on lower-use wikis can combine well with account creation moderation (see below)
      • only trigger when new external (http) links are added
      • and more


  • block specific text patterns completely
    • consider hosts/domains commonly linked to, common medicine names, etc. $wgSpamRegex exists in the mediawiki core for this.
    • But you do need to know regular expressions
    • and it's time-consuming to be thorough while not being overly restrictive
    • ...you'd probably want some text patterns to trigger a captcha rather than block the edit


A combination of these and more may be best, particularly if you want to bother real users as little as possible.


Accounts

I now see a few new accounts per day. It seems reCAPTCHA is relatively broken.

On lower-used wikis, you can moderate account creation using the ConfirmAccount extension.



Optional

TeX formatting

To get the wiki to use TeX formula formatting, you need ocaml (a functional language) installed, and then:

  • compile the code in math/. (run make)
  • set $wgUseTeX = true; in LocalSettings.php

The reason for this executable is protecting you from risky TeX. The app parses and filters the TeX. This also means it only allows basic math TeX, not advanced LaTeX commands.


Aesthetics

Changing the picture

That top left one there - it's handy to tell wikis apart.


In recent versions, you can set $wgLogo to the URL you want.


In older versions, you had to change the stylesheet or the image it pointed to (by default all skins point the same way, so changing the image could be handier).

I personally find it easiest to find the image, copy it out, edit it, and copy the result back in. This way it is the right size, starts off being transparent, and is saved in the right format (it's a PNG with alpha transparency).

The image's location changed a few times between versions:

  • 1.4: skins/common/images/wiki.png
  • 1.3: stylesheets/images/wiki.png


Altering the menu

Smaller wikis will probably have no use for 'current events' or 'community portal'. How to do this depends on the version.

For 1.5, you should visit the wiki's MediaWiki:Sidebar page. Each entry in the list there is in the reference-to-page-with-url|reference-to-page-with-name form. These references are into the Mediawiki: namespace, so for the Categories link on the left I added:

** categories-url|categories

And made the pages

  • Mediawiki:categories-url with contents: "Special:Categories"
  • Mediawiki:categories with contents: "Categories" (it seemed to exist already?)


For this and many similar alterations, see The wikiMedia FAQ and the Mediawiki meta-wiki.


Hiding that index.php

This is done with black magic, better known as mod_rewrite (see this), although there are other ways.

Maintenance

Backup

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Most of the information sits in the database, which means that a database backup will get most content, and you can restore such a backup into the database later.

In the case of mysql, this typically means a mysqldump. Be careful of character set corruption, though.


There are also a number of things stored on the filesystem, including:

  • LocalSettings.php
  • extensions (and their settings)
  • Uploaded files (the database will reference them).
  • ...more

A more complete backup includes them too. The simplest solution is probably to tar the entire thing (the virtualhost's documentroot), though you can probably exclude things like the image cache(verify)
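
A minimal sketch of such a backup, split into a database part and a filesystem part. The function names are made up, mysqldump is assumed to find its credentials by itself (e.g. from ~/.my.cnf), and treating images/thumb as the regenerable thumbnail cache is an assumption you should check against your own install. The character set is passed in explicitly because of the corruption risk mentioned above.

```shell
# backup_db DB_NAME OUT_FILE CHARSET
# Dumps the wiki database. CHARSET should be whatever your tables really
# use; guessing wrong is one way to end up with mangled characters.
backup_db() {
    db_name="$1"
    out_file="$2"
    charset="$3"
    mysqldump --default-character-set="$charset" \
              --result-file="$out_file" "$db_name"
}

# backup_files WIKI_ROOT OUT_FILE
# Archives the documentroot. Use an absolute path for OUT_FILE, and keep
# it outside WIKI_ROOT so the archive does not try to include itself.
backup_files() {
    wiki_root="$1"
    out_file="$2"
    # Assumption: images/thumb only holds thumbnails that can be regenerated.
    tar -czf "$out_file" --exclude='./images/thumb' -C "$wiki_root" .
}

# Example calls (names and paths are hypothetical):
# backup_db wikidb /var/backups/wiki/db.sql latin1
# backup_files /var/www/helpful /var/backups/wiki/files.tar.gz
```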



mediawiki and mangled characters

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

After quite a few upgrades I've had mangled characters.

Note that there is a big difference between text being mangled in storage, and data only being retrieved/presented incorrectly.

For mediawiki it's usually the second, because the frontend is UTF-8 throughout.


When you're upgrading/loading data from a rather old mediawiki version (which ones?(verify)) or from MySQL versions before 4.1, there are a number of details to know about to do the conversion properly. See e.g. http://www.mediawiki.org/wiki/Manual:Backing_up_a_wiki


Note that this is about database schema. No data changes when you upgrade. Particularly when updating between more recent installations, the likely problem is unwanted implicit database translation, such as having the wrong connection charset.

This may come down to looking at $wgDBmysql5 in your LocalSettings.php. Very roughly, if it's true then mediawiki assumes UTF-8 and if it's false it assumes latin1. If setting:

$wgDBmysql5 = true;  # possibly $wgDBmysql4 = false;, though it's deprecated in recent versions

...solves your problem, then yay. If you want to understand the problem, there's more reading to do (I'd appreciate a summary if you do:) ).



Shrinking the database

See also http://www.mediawiki.org/wiki/Manual:Reduce_size_of_the_database



Over time, most of the size of the database is usually old revisions, which are kept as uncompressed text for speed reasons. Compressing or even deleting them usually shrinks the database a lot.

You can:

  • compress old revisions
    • old revisions are kept uncompressed by default for speed reasons. Compressed revisions are slightly slower to use, but (depending on the amount of old revisions) may save a lot of database storage
    • Sometimes causes trouble (at least, I lost some pages in the process) -- so make a backup that is easy to restore. (TODO: Search for reports mentioning "Call to a member function uncompress() on a non-object")
    • done by a shell script
  • delete old revisions
    • you give up any and all ability to look back to and restore from old data (yes, you can save a database backup, but those are not that easy to use)
    • You can do this selectively or globally. See [1]
  • delete archived pages
    • ...which refers to deleted pages that have history, which are kept around in case you want to undelete them.
    • Usually little space compared to old revisions
    • Can be done with a
      TRUNCATE archive;
      to remove the references to archive entries, then a
      php maintenance/purgeOldText.php --purge
      to remove the related text content.




Writing templates

See also [2]


Arguments

Using their values

You can use positional and named parameters:

  • positional: {{{1}}} (one-based counting)
  • named: {{{command}}}

References to undefined parameters are taken literally, so they show up as the parameter name in triple curly braces.

You usually want a default value, using the form {{{1|}}}, which in this case means an empty string.


For example, to link to documentation with a configurable URL and an optional fragment, make the template something like:

[http://example.com/doc/{{{1}}}.html#{{{2|}}} Documentation for {{{1}}}]

which is then usable like

{{docs|progname}}

and like

{{docs|fileformats|inputfiles}}

Of course, this does cheat a little with the #, which is always emitted even when there is no fragment. You often have to start playing with more advanced logic, which is hard to do without mediawiki extensions for specifically that purpose.

Named arguments are fairly obvious too. For example, templates defined like:

Doc set: {{{1}}}
Command: {{{command}}}

can be used as:

{{docs|basic|command=sleep}}

Numeric parameters are essentially auto-named:

{{docs|1=basic|command=sleep}}

The difference between

{{test||foo}}
{{test|2=foo}}

...is that in the former, argument 1 is an empty string, whereas in the latter it is undefined.


When passing in a pipe that should only be interpreted in the result of the template, for example as part of a table row, use {{!}}. If you want it uninterpreted, use the HTML entity, &#124;.


Character safety

When placing strings into URLs, you probably want to know about the urlencode: magic word so that spaces in arguments (and other bad characters) will not mess things up.

For example, the {{imagesearch}} template here contains:

[http://www.google.com/images?q=({{urlencode:{{{1|test}}}}}) {{{2|{{{1|image search}}}}}}]

The search term falls back to test (why not). The link text is the second argument if you have one, and the search term if you do not (or 'image search' if neither is given).


Actually, there are three urlencode variants: QUERY (the default), WIKI, and PATH. See http://www.mediawiki.org/wiki/Help:Magic_words#URL_data for details.

Including help on template pages

On Template: pages, you can use <noinclude> sections for content that displays only on the template page itself, such as help and usage, and is left out when the template is called. (<includeonly> does the reverse: its content is used only when the template is called.) See also [3].


Conditionals, logic

So far, it looks like templates very much originate as a hack, with almost no actual capabilities.

From the pages I've read so far, it seems anything actually useful sits in mediawiki extensions, primarily ParserFunctions.

Examples

Usable templates in this wiki

{{comment}} makes things a subtler color (like so)

<div style="display:inline; color: #445500">{{{1}}}</div>


{{inlinecode}} uses a styled monospace for specifying commands or code inline. You regularly have to wrap its contents in a <nowiki> tag to avoid interpretation as template parameters. Defined as something like:

<div style="display:inline; background-color: #eee; border-top:1px solid #ccc; border-bottom:1px solid #ccc"><tt>{{{1}}}</tt></div>


{{stub}} and {{feelfree}} are boxes with some text and a link, defined using something like:

{| align=center style="background: #f9f9f9; border: 1px solid #aaa; padding: .2em; margin-bottom: 3px; font-size: 95%; width: auto;"
| style="padding-right: 4px; padding-left: 4px;" | '''This article (or section) is a [[Help:Editing#Templates|stub]].'''<br>It is here because it was planned, and some notes were dropped here. Feel free to add more notes or suggestions, or write an article.
|}


{{verify}} just shows '(verify)'. The purpose is to be able to search for these markers, which is why it links to a page called verify.

<em style="color:#aaa">([[verify]])</em>
