• Enterprise Search Engine Development: The Connector (Part 14)


    Looking back at the start method of the Context class, the remaining part is the method that starts the scheduler:

      /**
       * Start up the Scheduler.
       */
      private void startScheduler() {
        traversalScheduler =
            (TraversalScheduler) getRequiredBean("TraversalScheduler",
                TraversalScheduler.class);
        if (traversalScheduler != null) {
          traversalScheduler.init();
        }
      }

    In other words, it looks up the TraversalScheduler bean and calls its init() method.

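    TraversalScheduler implements the Runnable interface, and its run() method executes on a dedicated thread that init() starts. Its source code is as follows: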

    /**
     * Scheduler that schedules connector traversal.  This class is thread safe.
     * Must initialize TraversalScheduler before running it.
     *
     * <p> This facility includes a schedule thread that runs a loop.
     * Each iteration it asks the instantiator for the schedule
     * for each Connector Instance and runs batches for those that
     * are
     * <OL>
     * <LI> scheduled to run.
     * <LI> have not exhausted their quota for the current time interval.
     * <LI> are not currently running.
     * </OL>
     * The implementation must handle the situation that a Connector
     * Instance is running.
     */
    public class TraversalScheduler implements Runnable {
      public static final String SCHEDULER_CURRENT_TIME = "/Scheduler/currentTime";
    
      private static final Logger LOGGER =
        Logger.getLogger(TraversalScheduler.class.getName());
    
      private final Instantiator instantiator;
    
      private boolean isInitialized; // Protected by instance lock.
      private boolean isShutdown; // Protected by instance lock.
    
      /**
       * Create a scheduler object.
       *
       * @param instantiator used to get schedule for connector instances
       */
      public TraversalScheduler(Instantiator instantiator) {
        this.instantiator = instantiator;
        this.isInitialized = false;
        this.isShutdown = false;
      }
    
      public synchronized void init() {
        if (isInitialized) {
          return;
        }
        isInitialized = true;
        isShutdown = false;
        new Thread(this, "TraversalScheduler").start();
      }
    
      public synchronized void shutdown() {
        if (isShutdown) {
          return;
        }
        isInitialized = false;
        isShutdown = true;
      }
    
      /**
       * Determines whether scheduler should run.
       *
       * @return true if we are in a running state and scheduler should run or
       *         continue running.
       */
      private synchronized boolean isRunningState() {
        return isInitialized && !isShutdown;
      }
    
      private void scheduleBatches() {
        for (String connectorName : instantiator.getConnectorNames()) {
          NDC.pushAppend(connectorName);
          try {
            instantiator.getConnectorCoordinator(connectorName).startBatch();
          } catch (ConnectorNotFoundException e) {
            // Looks like the connector just got deleted.  Don't schedule it.
          } finally {
            NDC.pop();
          }
        }
      }
    
      public void run() {
        NDC.push("Traverse");
        try {
          while (true) {
            try {
              if (!isRunningState()) {
                LOGGER.info("TraversalScheduler thread is stopping due to "
                    + "shutdown or not being initialized.");
                return;
              }
              scheduleBatches();
              // Give someone else a chance to run.
              try {
                synchronized (this) {
                  wait(1000);
                }
              } catch (InterruptedException e) {
                // May have been interrupted for shutdown.
              }
            } catch (Throwable t) {
              LOGGER.log(Level.SEVERE,
                  "TraversalScheduler caught unexpected Throwable: ", t);
            }
          }
        } finally {
          NDC.remove();
        }
      }
    }

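    TraversalScheduler depends on the Instantiator class: it uses the Instantiator to iterate over all connector names, obtain each connector's coordinator (a ConnectorCoordinatorImpl object), and call its startBatch() method.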

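    The run() method is an infinite loop that keeps polling: each pass calls scheduleBatches() and then waits up to one second, and the loop exits only once the scheduler is no longer in the running state.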
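
    The polling pattern described above can be sketched in isolation. The class below is a hypothetical stand-in written for illustration, not the Connector Manager class: the name PollingSchedulerSketch, the pass callback, the pollMillis parameter, and the demo helper are all assumptions. It also calls notifyAll() in shutdown() so the loop wakes up promptly, which the quoted source does not do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a polling scheduler; not the Connector Manager class.
final class PollingSchedulerSketch implements Runnable {
  private final Runnable pass;    // work performed on every polling pass
  private final long pollMillis;  // pause between passes, must be > 0
  private boolean running;        // protected by the instance lock
  private Thread thread;          // protected by the instance lock

  PollingSchedulerSketch(Runnable pass, long pollMillis) {
    this.pass = pass;
    this.pollMillis = pollMillis;
  }

  // Starts the polling thread; a second call while running does nothing.
  synchronized void init() {
    if (running) {
      return;
    }
    running = true;
    thread = new Thread(this, "PollingSchedulerSketch");
    thread.start();
  }

  // Clears the flag and wakes the loop so it exits without waiting out the pause.
  // (Assumption: the quoted TraversalScheduler only flips its flags here.)
  synchronized void shutdown() {
    running = false;
    notifyAll();
  }

  // Blocks until the polling thread has exited.
  void awaitStopped() throws InterruptedException {
    Thread t;
    synchronized (this) {
      t = thread;
    }
    if (t != null) {
      t.join();
    }
  }

  @Override
  public void run() {
    while (true) {
      synchronized (this) {
        if (!running) {
          return;
        }
      }
      try {
        pass.run();
      } catch (RuntimeException e) {
        // One failing pass should not stop the scheduler.
      }
      synchronized (this) {
        if (!running) {
          return;
        }
        try {
          wait(pollMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
  }

  // Runs the scheduler until minPasses passes have completed (or 5 seconds elapse),
  // stops it, and returns the number of passes that ran.
  static int demo(int minPasses) {
    AtomicInteger passes = new AtomicInteger();
    CountDownLatch latch = new CountDownLatch(minPasses);
    PollingSchedulerSketch scheduler = new PollingSchedulerSketch(() -> {
      passes.incrementAndGet();
      latch.countDown();
    }, 10);
    scheduler.init();
    try {
      latch.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      scheduler.shutdown();
    }
    try {
      scheduler.awaitStopped();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return passes.get();
  }

  public static void main(String[] args) {
    System.out.println("passes: " + demo(3));
  }
}
```

    As in the original, the running flag is only read and written while holding the instance lock, so the polling thread is guaranteed to see a shutdown request on its next check.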

    Let us revisit the startBatch() method of the ConnectorCoordinatorImpl class covered earlier:

      //@Override
      public synchronized boolean startBatch() throws ConnectorNotFoundException {
        verifyConnectorInstanceAvailable();
        if (!shouldRun()) {
          return false;
        }
    
        BatchSize batchSize = loadManager.determineBatchSize();
        if (batchSize.getMaximum() == 0) {
          return false;
        }
        taskHandle = null;
        currentBatchKey = new Object();
    
        try {
          BatchCoordinator batchCoordinator = new BatchCoordinator(this);
          TraversalManager traversalManager =
              getConnectorInterfaces().getTraversalManager();
          Traverser traverser = new QueryTraverser(pusherFactory,
              traversalManager, batchCoordinator, name,
              Context.getInstance().getTraversalContext());
          TimedCancelable batch =  new CancelableBatch(traverser, name,
              batchCoordinator, batchCoordinator, batchSize);
          taskHandle = threadPool.submit(batch);
          return true;
        } catch (ConnectorNotFoundException cnfe) {
          LOGGER.log(Level.WARNING, "Connector not found - this is normal if you "
              + " recently reconfigured your connector instance: " + cnfe);
        } catch (InstantiatorException ie) {
          LOGGER.log(Level.WARNING,
              "Failed to perform connector content traversal.", ie);
          delayTraversal(TraversalDelayPolicy.ERROR);
        }
        return false;
      }

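    The method first evaluates !shouldRun() and returns false right away if the connector should not run. Let us look at the source of shouldRun():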

      /**
       * Returns {@code true} if it is OK to start a traversal,
       * {@code false} otherwise.
       */
      // Package access because this is called by tests.
      synchronized boolean shouldRun() {
        // Are we already running? If so, we shouldn't run again.
        if (taskHandle != null && !taskHandle.isDone()) {
          return false;
        }
    
        // Don't run if we have postponed traversals.
        if (System.currentTimeMillis() < traversalDelayEnd) {
          return false;
        }
    
        Schedule schedule = getSchedule();
    
        // Don't run if traversals are disabled.
        if (schedule.isDisabled()) {
          return false;
        }
    
        // Don't run if we have exceeded our configured host load.
        if (loadManager.shouldDelay()) {
          return false;
        }
    
        // OK to run if we are within one of the Schedule's traversal intervals.
        Calendar now = Calendar.getInstance();
        int hour = now.get(Calendar.HOUR_OF_DAY);
        for (ScheduleTimeInterval interval : schedule.getTimeIntervals()) {
          int startHour = interval.getStartTime().getHour();
          int endHour = interval.getEndTime().getHour();
          if (0 == endHour) {
            endHour = 24;
          }
          if (endHour < startHour) {
            // The traversal interval straddles midnight.
            if ((hour >= startHour) || (hour < endHour)) {
              return true;
            }
          } else {
            // The traversal interval falls wholly within the day.
            if ((hour >= startHour) && (hour < endHour)) {
              return true;
            }
          }
        }
    
        return false;
      }

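    This method decides whether the connector may be scheduled. It checks whether the previous batch has finished, whether a traversal delay is still in effect, whether the schedule is disabled, whether the load manager asks for a delay, and whether the current hour falls within one of the schedule's traversal intervals.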
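
    The interval check at the end of shouldRun() is the least obvious part, so here is a minimal sketch of just that logic, reduced to plain integers. The class and method names (HourWindowSketch, inWindow) are hypothetical and exist only for this example; the comparisons mirror the quoted code, including the convention that an end hour of 0 means midnight at the end of the day.

```java
// Hypothetical helper that isolates the hour-window test from shouldRun().
final class HourWindowSketch {
  private HourWindowSketch() {}

  // Returns true if hour (0-23) lies in the half-open interval [startHour, endHour).
  // An endHour of 0 is treated as 24, i.e. midnight at the end of the day.
  static boolean inWindow(int hour, int startHour, int endHour) {
    if (endHour == 0) {
      endHour = 24;
    }
    if (endHour < startHour) {
      // The interval straddles midnight, e.g. 22 -> 6.
      return hour >= startHour || hour < endHour;
    }
    // The interval falls wholly within one day, e.g. 9 -> 17.
    return hour >= startHour && hour < endHour;
  }

  public static void main(String[] args) {
    System.out.println(inWindow(23, 22, 6));  // true: late evening of an overnight window
    System.out.println(inWindow(6, 22, 6));   // false: the end hour is exclusive
    System.out.println(inWindow(10, 9, 17));  // true: inside a daytime window
    System.out.println(inWindow(23, 20, 0));  // true: end hour 0 is read as 24
  }
}
```

    Because only the hour is compared, intervals are effectively rounded to whole hours; a traversal that is allowed at 22:00 is allowed from the first minute of that hour.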

    ---------------------------------------------------------------------------

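    This series, Enterprise Search Engine Development: The Connector, is my own original work.

    When reposting, please credit the source: cnblogs (博客园), 刺猬的温驯

    Link to this article: http://www.cnblogs.com/chenying99/archive/2013/03/20/2970378.html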

  • Original article: https://www.cnblogs.com/chenying99/p/2970378.html