Connecting Weka to a JDBC Database

1 Introduction to Weka

Weka stands for Waikato Environment for Knowledge Analysis. It is free, non-commercial, open-source software for machine learning and data mining that runs on Java. Its commercial counterpart is Clementine, the data mining product from SPSS.

2 Downloading and installing Weka

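First, download Weka from the official site: https://www.cs.waikato.ac.nz/ml/weka/downloading.html. Builds are available for Windows (32-bit and 64-bit), macOS, and Linux. Each Windows build comes in two variants, one that bundles a JRE and one that does not; pick the one that matches whether a JDK/JRE is already installed on your machine.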

After the download finishes, run the .exe file and click Next through the installer until installation completes.

3 Configuring the database file

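Weka is a fairly powerful data mining and analysis tool. It supports MySQL, PostgreSQL, Oracle, and other databases; you only need to edit its configuration file.
Extract weka.jar from the installation directory and open the weka\experiment folder. There, edit either DatabaseUtils.props (the file Weka loads by default) or DatabaseUtils.props.mysql (the MySQL template) as follows: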
# Database settings for MySQL 3.23.x, 4.x
#
# General information on database access can be found here:
# http://weka.wikispaces.com/Databases
#
# url:     http://www.mysql.com/
# jdbc:    http://www.mysql.com/products/connector/j/
# author:  Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 5836 $


# JDBC driver (comma-separated list)
#jdbcDriver=org.gjt.mm.mysql.Driver
# database driver (comments must be on their own line in a .properties file)
jdbcDriver=com.mysql.jdbc.Driver


# database URL
#jdbcURL=jdbc:mysql://server_name:3306/database_name
# replace database_name with the name of your database
jdbcURL=jdbc:mysql://localhost:3306/database_name

# specific data types
# string, getString() = 0;    --> nominal
# boolean, getBoolean() = 1;  --> nominal
# double, getDouble() = 2;    --> numeric
# byte, getByte() = 3;        --> numeric
# short, getByte()= 4;        --> numeric
# int, getInteger() = 5;      --> numeric
# long, getLong() = 6;        --> numeric
# float, getFloat() = 7;      --> numeric
# date, getDate() = 8;        --> date
# text, getString() = 9;      --> string
# time, getTime() = 10;       --> date
# added data type mappings
TINYINT=3
SMALLINT=4
#SHORT=4
SHORT=5
INTEGER=5
INT=5
INT_UNSIGNED=6
BIGINT=6
LONG=6
REAL=7
NUMERIC=2
DECIMAL=2
FLOAT=2
DOUBLE=2
CHAR=0
TEXT=0
VARCHAR=0
LONGVARCHAR=9
BINARY=0
VARBINARY=0
LONGVARBINARY=9
BIT=1
BLOB=9
DATE=8
TIME=8
DATETIME=8
TIMESTAMP=8


# other options
CREATE_DOUBLE=DOUBLE
CREATE_STRING=TEXT
CREATE_INT=INT
CREATE_DATE=DATETIME
DateFormat=yyyy-MM-dd HH:mm:ss
checkUpperCaseNames=false
checkLowerCaseNames=false
checkForTable=true


# All the reserved keywords for this database
# Based on the keywords listed at the following URL (2009-04-13):
# http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-reservedwords-5-0.html
Keywords=\
  ADD,\
  ALL,\
  ALTER,\
  ANALYZE,\
  AND,\
  AS,\
  ASC,\
  ASENSITIVE,\
  BEFORE,\
  BETWEEN,\
  BIGINT,\
  BINARY,\
  BLOB,\
  BOTH,\
  BY,\
  CALL,\
  CASCADE,\
  CASE,\
  CHANGE,\
  CHAR,\
  CHARACTER,\
  CHECK,\
  COLLATE,\
  COLUMN,\
  COLUMNS,\
  CONDITION,\
  CONNECTION,\
  CONSTRAINT,\
  CONTINUE,\
  CONVERT,\
  CREATE,\
  CROSS,\
  CURRENT_DATE,\
  CURRENT_TIME,\
  CURRENT_TIMESTAMP,\
  CURRENT_USER,\
  CURSOR,\
  DATABASE,\
  DATABASES,\
  DAY_HOUR,\
  DAY_MICROSECOND,\
  DAY_MINUTE,\
  DAY_SECOND,\
  DEC,\
  DECIMAL,\
  DECLARE,\
  DEFAULT,\
  DELAYED,\
  DELETE,\
  DESC,\
  DESCRIBE,\
  DETERMINISTIC,\
  DISTINCT,\
  DISTINCTROW,\
  DIV,\
  DOUBLE,\
  DROP,\
  DUAL,\
  EACH,\
  ELSE,\
  ELSEIF,\
  ENCLOSED,\
  ESCAPED,\
  EXISTS,\
  EXIT,\
  EXPLAIN,\
  FALSE,\
  FETCH,\
  FIELDS,\
  FLOAT,\
  FLOAT4,\
  FLOAT8,\
  FOR,\
  FORCE,\
  FOREIGN,\
  FROM,\
  FULLTEXT,\
  GOTO,\
  GRANT,\
  GROUP,\
  HAVING,\
  HIGH_PRIORITY,\
  HOUR_MICROSECOND,\
  HOUR_MINUTE,\
  HOUR_SECOND,\
  IF,\
  IGNORE,\
  IN,\
  INDEX,\
  INFILE,\
  INNER,\
  INOUT,\
  INSENSITIVE,\
  INSERT,\
  INT,\
  INT1,\
  INT2,\
  INT3,\
  INT4,\
  INT8,\
  INTEGER,\
  INTERVAL,\
  INTO,\
  IS,\
  ITERATE,\
  JOIN,\
  KEY,\
  KEYS,\
  KILL,\
  LABEL,\
  LEADING,\
  LEAVE,\
  LEFT,\
  LIKE,\
  LIMIT,\
  LINES,\
  LOAD,\
  LOCALTIME,\
  LOCALTIMESTAMP,\
  LOCK,\
  LONG,\
  LONGBLOB,\
  LONGTEXT,\
  LOOP,\
  LOW_PRIORITY,\
  MATCH,\
  MEDIUMBLOB,\
  MEDIUMINT,\
  MEDIUMTEXT,\
  MIDDLEINT,\
  MINUTE_MICROSECOND,\
  MINUTE_SECOND,\
  MOD,\
  MODIFIES,\
  NATURAL,\
  NOT,\
  NO_WRITE_TO_BINLOG,\
  NULL,\
  NUMERIC,\
  ON,\
  OPTIMIZE,\
  OPTION,\
  OPTIONALLY,\
  OR,\
  ORDER,\
  OUT,\
  OUTER,\
  OUTFILE,\
  PRECISION,\
  PRIMARY,\
  PRIVILEGES,\
  PROCEDURE,\
  PURGE,\
  READ,\
  READS,\
  REAL,\
  REFERENCES,\
  REGEXP,\
  RELEASE,\
  RENAME,\
  REPEAT,\
  REPLACE,\
  REQUIRE,\
  RESTRICT,\
  RETURN,\
  REVOKE,\
  RIGHT,\
  RLIKE,\
  SCHEMA,\
  SCHEMAS,\
  SECOND_MICROSECOND,\
  SELECT,\
  SENSITIVE,\
  SEPARATOR,\
  SET,\
  SHOW,\
  SMALLINT,\
  SONAME,\
  SPATIAL,\
  SPECIFIC,\
  SQL,\
  SQLEXCEPTION,\
  SQLSTATE,\
  SQLWARNING,\
  SQL_BIG_RESULT,\
  SQL_CALC_FOUND_ROWS,\
  SQL_SMALL_RESULT,\
  SSL,\
  STARTING,\
  STRAIGHT_JOIN,\
  TABLE,\
  TABLES,\
  TERMINATED,\
  THEN,\
  TINYBLOB,\
  TINYINT,\
  TINYTEXT,\
  TO,\
  TRAILING,\
  TRIGGER,\
  TRUE,\
  UNDO,\
  UNION,\
  UNIQUE,\
  UNLOCK,\
  UNSIGNED,\
  UPDATE,\
  UPGRADE,\
  USAGE,\
  USE,\
  USING,\
  UTC_DATE,\
  UTC_TIME,\
  UTC_TIMESTAMP,\
  VALUES,\
  VARBINARY,\
  VARCHAR,\
  VARCHARACTER,\
  VARYING,\
  WHEN,\
  WHERE,\
  WHILE,\
  WITH,\
  WRITE,\
  XOR,\
  YEAR_MONTH,\
  ZEROFILL


# The character to append to attribute names to avoid exceptions due to
# clashes between keywords and attribute names
KeywordsMaskChar=_


#flags for loading and saving instances using DatabaseLoader/Saver
nominalToStringLimit=50
idColumn=auto_generated_id


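Then save the edited file under the name DatabaseUtils.props, because the GUI client looks for a file with this name by default when it connects to a database. Finally, repackage the jar and replace the original weka.jar with it.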

4 Configuring the Weka environment

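Create a lib folder in the Weka installation directory and put the database driver jar in it. Then set these environment variables:
WEKA_HOME: D:\Program Files\Weka-3-8
CLASSPATH: %WEKA_HOME%\lib\mysql-connector-java-5.1.37-bin.jar
It is best to also put a copy of the driver in a lib folder under the Java installation directory (create the folder if needed) and add %JAVA_HOME%\lib\mysql-connector-java-5.1.37-bin.jar to CLASSPATH. Otherwise you may run into the awkward situation where the database driver cannot be found.
Finally, find RunWeka.ini in the Weka installation directory and edit it as shown in the screenshot.
[Screenshot: the modified RunWeka.ini]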
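
To confirm that the CLASSPATH setting above actually exposes the driver, a quick check outside Weka can help. The following is a minimal sketch, assuming it is compiled and run with the same CLASSPATH; DriverCheck is a hypothetical helper, not a Weka class, and it only tests the JVM's own classpath, not whatever RunWeka.ini adds.

```java
// Hypothetical helper (not part of Weka): reports whether a class,
// such as a JDBC driver, can be loaded from the current classpath.
final class DriverCheck {

    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class is missing, or present but could not be linked/initialized.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver class named in DatabaseUtils.props.
        String driver = args.length > 0 ? args[0] : "com.mysql.jdbc.Driver";
        System.out.println(driver
            + (isOnClasspath(driver) ? " found" : " NOT found on the classpath"));
        // Printing the effective classpath makes a wrong jar path easy to spot.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```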

5 Basic usage of Weka

Start Weka, open the Explorer, and click Open DB.
Once the connection succeeds, you can go on to data mining and analysis.
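
If Open DB cannot connect, it can be useful to try the same URL with plain JDBC, which separates driver and URL problems from Weka configuration problems. This is only a sketch under assumptions: ConnectionCheck is a hypothetical helper, and the host, port, database name, user, and password in main are placeholders to replace with your own.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper (not part of Weka): tries a JDBC URL directly.
final class ConnectionCheck {

    // Builds a URL in the same form as jdbcURL in DatabaseUtils.props.
    static String mysqlUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Returns null if the connection opens, otherwise the error message.
    static String tryConnect(String url, String user, String password) {
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            return null;
        } catch (SQLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Placeholder values; use your own host, database, and credentials.
        String url = mysqlUrl("localhost", 3306, "database_name");
        String error = tryConnect(url, "user", "password");
        System.out.println(error == null
            ? "Connected to " + url
            : "Connection failed: " + error);
    }
}
```

A "No suitable driver" message points back to the classpath setup in section 4, while an access or unknown-database error means the driver loaded and the URL or credentials need attention.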
This is my first blog post, so corrections and criticism of its shortcomings are welcome. Thank you all.