[Elasticsearch Source Code Analysis] Building ES from Source

645 views, uploaded 2020-07-24 09:37:02

Source: 西语语言学工作坊


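This is the first article in a series analyzing the Elasticsearch source code. I worked on search advertising for a long time and had always wanted to write about Lucene and ES; now I finally have the chance. As the opening article, it briefly covers setting up a local environment for building the Elasticsearch source, which lays the groundwork for the source analysis in later articles.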

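Tools used:

  • IntelliJ IDEA
  • JDK 1.8
  • Gradle 4.3
  • The elasticsearch-6.0.1 release distribution (the ES website offers the latest version by default; earlier releases can be downloaded from https://www.elastic.co/cn/downloads/past-releases#elasticsearch)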

Downloading the source


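The Elasticsearch source is hosted on GitHub, but downloading from GitHub was slow for me, so I cloned it from the Gitee mirror instead: https://gitee.com/mirrors/elasticsearch
The main branch is already at 7.x, a version that few projects use at the moment; most are still on 6.x or even 5.x. We pick version 6.0.1: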
git checkout v6.0.1

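Starting with 5.x, Elasticsearch switched its default build tool from Maven to Gradle, so a working Gradle installation is needed locally; switch to a mirror repository inside China if necessary. I build Elasticsearch with Gradle 4.3 and JDK 1.8. (Elasticsearch 6.0.1 requires Gradle 4.3 or later; otherwise the build fails.)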

Elasticsearch has a class that records the current version, org.elasticsearch.Version:

public class Version implements Comparable<Version> {
    /*
     * The logic for ID is: XXYYZZAA, where XX is major version, YY is minor version, ZZ is revision, and AA is alpha/beta/rc indicator AA
     * values below 25 are for alpha builder (since 5.0), and above 25 and below 50 are beta builds, and below 99 are RC builds, with 99
     * indicating a release the (internal) format of the id is there so we can easily do after/before checks on the id
     */
    // ... (other version constants omitted)

    public static final Version V_6_0_1 =
        new Version(V_6_0_1_ID, org.apache.lucene.util.Version.LUCENE_7_0_1);
    public static final Version CURRENT = V_6_0_1;
    // ... (rest of the class omitted)
}

This shows that the current ES version is 6.0.1 and that it uses Lucene 7.0.1.
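To make the XXYYZZAA rule in the comment concrete, here is a minimal sketch that splits such an id into its parts. It is not the ES implementation: the class VersionIdDemo and its methods are hypothetical, the id 6000199 is derived from the rule (06 00 01 99 for the 6.0.1 release) rather than copied from the source, and treating the boundary value 25 as a beta is my assumption, since the comment leaves it open.

```java
class VersionIdDemo {
    /** Splits an id of the form XXYYZZAA into {major, minor, revision, build}. */
    static int[] decode(int id) {
        int major = (id / 1_000_000) % 100; // XX
        int minor = (id / 10_000) % 100;    // YY
        int revision = (id / 100) % 100;    // ZZ
        int build = id % 100;               // AA
        return new int[] {major, minor, revision, build};
    }

    /** Names the build stage from the AA part, following the thresholds in the comment. */
    static String stage(int build) {
        if (build < 25) return "alpha";
        if (build < 50) return "beta"; // assumption: 25 itself counts as beta
        if (build < 99) return "rc";
        return "release";
    }

    /** Formats an id as "major.minor.revision (stage)". */
    static String describe(int id) {
        int[] p = decode(id);
        return p[0] + "." + p[1] + "." + p[2] + " (" + stage(p[3]) + ")";
    }

    public static void main(String[] args) {
        // 06 00 01 99: major 6, minor 0, revision 1, release build
        System.out.println(describe(6000199));
    }
}
```

Because the most significant digits carry the major version, comparing two ids as plain integers orders the versions correctly, which is the "after/before checks" the comment mentions.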

Building with Gradle


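We read the source in IDEA, but importing the project directly leaves many packages unresolved and makes IDEA load very slowly, so we run Gradle from the command line first. Since IDEA is our IDE, the command is gradle idea (for Eclipse it would be gradle eclipse). The build takes a long time, about 10 minutes on my machine (MacBook Pro 2015). On success Gradle prints BUILD SUCCESSFUL (the original post shows a screenshot of this output).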

Importing into IDEA


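Once the build succeeds, start the IDE (you can run idea . in the current directory). Even though Gradle has already done its work, IDEA still spends a long time configuring and loading the project. When loading finishes, the project view shows the individual modules of the project (the original post shows a screenshot of this view).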


Trying to start


The Elasticsearch startup class is org.elasticsearch.bootstrap.Elasticsearch; its main method is the entry point (the original post shows a screenshot of it).


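The same package also contains org.elasticsearch.bootstrap.Bootstrap, which is easy to mistake for the startup class. Reading the code shows that the Elasticsearch startup class calls into Bootstrap to perform initialization.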
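The delegation can be pictured with a small stand-in. The sketch below is not the ES code: ElasticsearchSketch and BootstrapSketch are hypothetical classes, and the call order (main, init, Bootstrap.init, Bootstrap.setup) is only what the stack traces later in this article suggest, with the intermediate command-line classes left out.

```java
import java.util.ArrayList;
import java.util.List;

class StartupChainSketch {
    // Records the order in which the stand-in methods are called.
    static final List<String> calls = new ArrayList<>();

    /** Hypothetical stand-in for org.elasticsearch.bootstrap.Bootstrap. */
    static class BootstrapSketch {
        static void init() {
            calls.add("Bootstrap.init");
            setup();
        }

        static void setup() {
            // In the stack traces, the jar hell check and the security
            // configuration are both reached from Bootstrap.setup.
            calls.add("Bootstrap.setup");
        }
    }

    /** Hypothetical stand-in for org.elasticsearch.bootstrap.Elasticsearch. */
    static class ElasticsearchSketch {
        static void main(String[] args) {
            calls.add("Elasticsearch.main");
            init();
        }

        static void init() {
            calls.add("Elasticsearch.init");
            BootstrapSketch.init(); // the entry class delegates initialization
        }
    }

    public static void main(String[] args) {
        ElasticsearchSketch.main(args);
        System.out.println(String.join(" -> ", calls));
    }
}
```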


Startup needs some configuration, of course; without it the launch fails with errors.

  • First error

ERROR: the system property [es.path.conf] must be set

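es.path.conf is the path of the configuration directory. If there is no configuration file under this path, the default configuration is used; you can place an elasticsearch.yml file there.
Fix: I pointed this path directly at the downloaded elasticsearch-6.0.1 release directory: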
-Des.path.conf=/xxx/soft/elasticsearch-6.0.1

  • Second error

Exception in thread "main" java.lang.IllegalStateException: path.home is not configured
   at org.elasticsearch.env.Environment.<init>(Environment.java:101)
   at org.elasticsearch.node.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:85)
   at org.elasticsearch.cli.EnvironmentAwareCommand.createEnv(EnvironmentAwareCommand.java:78)
   at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:69)
   at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134)
   at org.elasticsearch.cli.Command.main(Command.java:90)
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85)

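This is the home path that ES runs from. plugins, modules, lib and so on are loaded from the corresponding subdirectories under it. If they are not specified, the default data, log, and config paths are also created under this home path. In practice, however, because of how the code is written, the config path must be specified explicitly or startup fails.
Fix: again, point this path at the downloaded elasticsearch-6.0.1 release directory: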
-Des.path.home=/xxx/soft/elasticsearch-6.0.1

  • Third error

Exception in thread "main" 2020-05-27 17:41:03,055 main ERROR No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
2020-05-27 17:41:03,164 main ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")
   at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
   at java.lang.SecurityManager.checkPermission(SecurityManager.java:585)
   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.checkMBeanTrustPermission(DefaultMBeanServerInterceptor.java:1848)
   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:322)
   at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
   at org.apache.logging.log4j.core.jmx.Server.register(Server.java:389)
   at org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:167)
   at org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:140)
   at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:556)
   at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:617)
   at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:634)
   at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:229)
   at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:242)
   at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
   at org.apache.logging.log4j.LogManager.getContext(LogManager.java:174)
   at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:648)
   at org.elasticsearch.common.logging.ESLoggerFactory.getLogger(ESLoggerFactory.java:54)
   at org.elasticsearch.common.logging.ESLoggerFactory.getLogger(ESLoggerFactory.java:62)
   at org.elasticsearch.common.logging.Loggers.getLogger(Loggers.java:101)
   at org.elasticsearch.ExceptionsHelper.<clinit>(ExceptionsHelper.java:42)
   at org.elasticsearch.ElasticsearchException.toString(ElasticsearchException.java:663)
   at java.lang.String.valueOf(String.java:2994)
   at java.io.PrintStream.println(PrintStream.java:821)
   at java.lang.Throwable$WrappedPrintStream.println(Throwable.java:748)
   at java.lang.Throwable.printStackTrace(Throwable.java:655)
   at java.lang.Throwable.printStackTrace(Throwable.java:643)
   at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1061)
   at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1052)
   at java.lang.Thread.dispatchUncaughtException(Thread.java:1959)

Fix: add the startup option -Dlog4j2.disable.jmx=true
I am not very familiar with Java security; this is a solution I found online.
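At this point the run configuration carries three VM options: -Des.path.conf, -Des.path.home, and -Dlog4j2.disable.jmx=true. Before launching it can help to confirm that the JVM actually sees them. The following is a minimal sketch for that purpose only; the class StartupPropertyCheck is hypothetical and is not part of ES.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

class StartupPropertyCheck {
    // The two path properties named in the first two error messages.
    static final List<String> REQUIRED = Arrays.asList("es.path.conf", "es.path.home");

    /** Returns the required property names that are missing or blank in props. */
    static List<String> missing(Properties props) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = props.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> absent = missing(System.getProperties());
        if (absent.isEmpty()) {
            System.out.println("all path properties are set");
        } else {
            System.out.println("missing -D options: " + absent);
        }
        // Optional flag from the third fix; prints null when it was not passed.
        System.out.println("log4j2.disable.jmx = " + System.getProperty("log4j2.disable.jmx"));
    }
}
```

Running it with the same VM options as the ES run configuration shows whether the paths reach the JVM as system properties.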

  • Fourth error

[2020-05-29T11:28:37,580][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: jar hell!
class: jdk.packager.services.UserJvmOptionsService
jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/lib/ant-javafx.jar
jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/lib/packager.jar
   at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:134) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:121) ~[main/:?]
   at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:69) ~[main/:?]
   at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134) ~[main/:?]
   at org.elasticsearch.cli.Command.main(Command.java:90) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85) ~[main/:?]
Caused by: java.lang.IllegalStateException: jar hell!
class: jdk.packager.services.UserJvmOptionsService
jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/lib/ant-javafx.jar
jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/lib/packager.jar
   at org.elasticsearch.bootstrap.JarHell.checkClass(JarHell.java:282) ~[main/:?]
   at org.elasticsearch.bootstrap.JarHell.checkJarHell(JarHell.java:192) ~[main/:?]
   at org.elasticsearch.bootstrap.JarHell.checkJarHell(JarHell.java:90) ~[main/:?]
   at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:197) ~[main/:?]
   at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:322) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:130) ~[main/:?]
... 6 more

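This looks like a class conflict inside the JDK itself: the class jdk.packager.services.UserJvmOptionsService exists twice, once in ant-javafx.jar and once in packager.jar. Upgrading the JDK should resolve it.
Fix: I took a lazy shortcut and commented out the part of JarHell.checkClass that throws the exception; if other problems come up later, I will consider upgrading the JDK.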
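The idea behind such a check can be sketched in a few lines: remember which jar each class was first seen in and report the first class that shows up in a second jar. This is a simplified illustration under my own assumptions, not the ES JarHell implementation; the class DuplicateClassCheck is hypothetical, and the jar contents are hard-coded instead of being read from the real files.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateClassCheck {
    /**
     * Returns the first class found in two different jars as
     * {className, firstJar, secondJar}, or null when there is no clash.
     */
    static String[] findClash(Map<String, List<String>> classesByJar) {
        Map<String, String> seenIn = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> jar : classesByJar.entrySet()) {
            for (String className : jar.getValue()) {
                // put returns the jar that previously claimed this class, if any
                String previous = seenIn.put(className, jar.getKey());
                if (previous != null && !previous.equals(jar.getKey())) {
                    return new String[] {className, previous, jar.getKey()};
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hard-coded stand-in for the two JDK jars named in the error above.
        Map<String, List<String>> jars = new LinkedHashMap<>();
        jars.put("ant-javafx.jar", Arrays.asList("jdk.packager.services.UserJvmOptionsService"));
        jars.put("packager.jar", Arrays.asList("jdk.packager.services.UserJvmOptionsService"));

        String[] clash = findClash(jars);
        if (clash != null) {
            System.out.println("jar hell! class: " + clash[0] + " jar1: " + clash[1] + " jar2: " + clash[2]);
        }
    }
}
```

Commenting out the exception, as I did, silences the report but leaves the duplicate in place; which copy of the class gets loaded then depends on classpath order.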

  • Fifth error

[2020-05-29T11:35:28,193][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
   at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:134) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:121) ~[main/:?]
   at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:69) ~[main/:?]
   at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134) ~[main/:?]
   at org.elasticsearch.cli.Command.main(Command.java:90) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85) ~[main/:?]
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
   at java.lang.String.substring(String.java:1967) ~[?:1.8.0_151]
   at org.elasticsearch.bootstrap.Security.readPolicy(Security.java:222) ~[main/:?]
   at org.elasticsearch.bootstrap.Security.getPluginPermissions(Security.java:179) ~[main/:?]
   at org.elasticsearch.bootstrap.Security.configure(Security.java:121) ~[main/:?]
   at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:207) ~[main/:?]
   at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:322) ~[main/:?]
   at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:130) ~[main/:?]

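The cause is that a local build is treated as a snapshot version by default, whereas the distribution we downloaded is a release, so a file cannot be matched when Security loads the modules. There are two possible fixes.
Fix 1: a lazy shortcut is to comment out the snapshot check in org.elasticsearch.bootstrap.Security#readPolicy (the original post shows a screenshot of the change).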


Fix 2: The ES we downloaded is an official release, so its jar names carry no snapshot suffix. In the distribution, rename elasticsearch-6.0.1/modules/reindex/elasticsearch-rest-client-6.0.1.jar to the snapshot name (elasticsearch-rest-client-6.0.1-SNAPSHOT.jar), and ES starts normally.
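Why a missing suffix ends in a StringIndexOutOfBoundsException can be reproduced in isolation. The sketch below is a simplified reconstruction based on the stack trace (String.substring called from Security.readPolicy with index -1), not the actual ES code; the class SnapshotSuffixSketch and the method alias are hypothetical names.

```java
class SnapshotSuffixSketch {
    /**
     * Strips "-<version>.jar" (or "-<version>-SNAPSHOT.jar" for a snapshot build)
     * from a jar file name. When the expected suffix is absent, indexOf returns -1
     * and substring(0, -1) throws StringIndexOutOfBoundsException.
     */
    static String alias(String jarName, String version, boolean snapshotBuild) {
        String expected = "-" + version + (snapshotBuild ? "-SNAPSHOT" : "") + ".jar";
        int index = jarName.indexOf(expected);
        return jarName.substring(0, index);
    }

    public static void main(String[] args) {
        // Fix 2: the renamed jar matches what a snapshot build looks for.
        System.out.println(alias("elasticsearch-rest-client-6.0.1-SNAPSHOT.jar", "6.0.1", true));

        try {
            // The situation in the error: release jar name, snapshot build.
            alias("elasticsearch-rest-client-6.0.1.jar", "6.0.1", true);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Seen this way, both fixes make the two sides agree: fix 1 removes the snapshot suffix from what the code expects, and fix 2 adds it to the file name.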


Running successfully

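After all these trials ES finally starts. The log shows that it listens for HTTP on port 9200 by default, so you can test against 127.0.0.1:9200 and then start debugging the code.
The log of a successful start looks like this: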

[2020-05-29T14:07:15,138][INFO ][i.n.u.i.PlatformDependent] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system instability.
[2020-05-29T14:07:15,322][INFO ][o.e.t.TransportService   ] [80GXGgY] publish_address {127.0.0.1:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}
[2020-05-29T14:07:15,339][WARN ][o.e.b.BootstrapChecks    ] [80GXGgY] initial heap size [268435456] not equal to maximum heap size [4294967296]; this can cause resize pauses and prevents mlockall from locking the entire heap
[2020-05-29T14:07:18,414][INFO ][o.e.c.s.MasterService    ] [80GXGgY] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {80GXGgY}{80GXGgYFTRaO0GqmHEG_0Q}{-R2fH2CmSuiKPLR5DOxJfA}{127.0.0.1}{127.0.0.1:9300}
[2020-05-29T14:07:18,421][INFO ][o.e.c.s.ClusterApplierService] [80GXGgY] new_master {80GXGgY}{80GXGgYFTRaO0GqmHEG_0Q}{-R2fH2CmSuiKPLR5DOxJfA}{127.0.0.1}{127.0.0.1:9300}, reason: apply cluster state (from master [master {80GXGgY}{80GXGgYFTRaO0GqmHEG_0Q}{-R2fH2CmSuiKPLR5DOxJfA}{127.0.0.1}{127.0.0.1:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2020-05-29T14:07:18,440][INFO ][o.e.h.n.Netty4HttpServerTransport] [80GXGgY] publish_address {127.0.0.1:9200}, bound_addresses {[fe80::1]:9200}, {[::1]:9200}, {127.0.0.1:9200}
[2020-05-29T14:07:18,440][INFO ][o.e.n.Node               ] [80GXGgY] started
[2020-05-29T14:07:18,444][INFO ][o.e.g.GatewayService     ] [80GXGgY] recovered [0] indices into cluster_state
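
A quick way to test the HTTP port from code is a plain GET request against the root path. The sketch below uses only JDK classes so that it also runs on JDK 1.8; the class NodePing is a hypothetical helper, and it assumes the node started above is still running on 127.0.0.1:9200.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class NodePing {
    /** Sends a GET request to the address and returns the response body as text. */
    static String get(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes the locally started node is listening on the default HTTP port.
        System.out.println(get("http://127.0.0.1:9200"));
    }
}
```

If the node is up, the response is a small JSON document describing it; a connection error here usually means the process has not finished starting or has exited.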