

Shift Scheduling Problem (Introduction)



\begin{align} \text{minimize} \quad &f(x): &\text{objective function}\\ \text{subject to} \\ &g(x) \geq 0: &\text{inequality constraints}\\ &h(x) = 0: &\text{equality constraints}\\ &a \leq x \leq b: &\text{variables}\\ \end{align}



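Let us now model and formulate this problem step by step. We use a variable x_{mda} that is 1 when member m takes on action a on day d, and 0 otherwise. The members are 夫 (husband), 妻 (wife), and 母 (the wife's mother), and the actions are 朝 (morning drop-off) and お迎え (pick-up). When neither my wife nor I am available, my wife's mother steps in to help, which is a great relief. Relying on that too much, however, strengthens the tendency to give up precious time with our child in favor of work and the like. So we take the number of times the mother helps as the objective function and set up the problem as minimizing it.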

\text{minimize} \quad \sum_{d \in D, \, a \in A} x_{\text{母},d,a} \qquad D=\{Mon, Tue, Wed, Thu, Fri\}, \, A=\{\text{朝 (morning)}, \text{お迎え (pick-up)}\}



import pulp

Member = ["夫", "妻", "母"]  # husband, wife, wife's mother
Day = ["Mon", "Tue", "Wed", "Thu", "Fri"]
Action = ["朝", "お迎え"]  # morning drop-off, pick-up

ShiftScheduling = pulp.LpProblem("ShiftScheduling", pulp.LpMinimize)

x = {}
for m in Member:
    for d in Day:
        for a in Action:
            x[m, d, a] = pulp.LpVariable("x({:},{:},{:})".format(m,d,a), 0, 1, pulp.LpInteger)

ShiftScheduling += pulp.lpSum(x['母', d, a] for d in Day for a in Action), "Target"



 \displaystyle  \sum_{m \in M} x_{mda}=1 \qquad M=\{\text{夫 (husband)}, \text{妻 (wife)}, \text{母 (mother)}\}, \, d \in D, \, a \in A


 \displaystyle   x_{mda}=0 \qquad (m,d,a) \in NG, \quad NG=\{(m, d, a) \mid \text{member } m \text{ cannot perform action } a \text{ on day } d\}


# Exactly one person must be assigned to each morning drop-off and pick-up on every day
for d in Day:
    for a in Action:
        ShiftScheduling += pulp.lpSum(x[m, d, a] for m in Member) == 1, "Constraint_{:}_{:}".format(d,a)

# Slots that a member cannot cover
NG = [['夫','Tue','朝'],['妻','Wed','お迎え'],['夫','Thu','お迎え'],['妻','Thu','お迎え'],['妻','Fri','朝']]
for m, d, a in NG:
    ShiftScheduling += x[m, d, a] == 0, "Constraint_NG_{:}_{:}_{:}".format(m,d,a)


Let us solve the optimization problem as it stands for now. All it takes is a call to the solve() function. The resulting shift schedule is also shown.

results = ShiftScheduling.solve()
print("optimality = {:}, target value = {:}".format(pulp.LpStatus[results], pulp.value(ShiftScheduling.objective)))

optimality = Optimal, target value = 1.0
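The reported optimum can be cross-checked without a solver. The following is a minimal sketch, not part of the original model: it reuses the Member, Day, Action, and NG data from the code above, enumerates every way of assigning exactly one member to each slot, and returns the smallest number of 母 shifts. The function name `brute_force_min_mother_help` is made up for this illustration.

```python
from itertools import product

# Labels copied from the PuLP model above.
Member = ["夫", "妻", "母"]
Day = ["Mon", "Tue", "Wed", "Thu", "Fri"]
Action = ["朝", "お迎え"]
NG = {("夫", "Tue", "朝"), ("妻", "Wed", "お迎え"), ("夫", "Thu", "お迎え"),
      ("妻", "Thu", "お迎え"), ("妻", "Fri", "朝")}

slots = [(d, a) for d in Day for a in Action]


def brute_force_min_mother_help():
    """Try every assignment of one member per slot; return the fewest 母 shifts."""
    best = None
    for assignment in product(Member, repeat=len(slots)):
        # Skip assignments that put a member in a slot they cannot cover.
        if any((m, d, a) in NG for m, (d, a) in zip(assignment, slots)):
            continue
        helps = assignment.count("母")
        if best is None or helps < best:
            best = helps
    return best


print(brute_force_min_mother_help())  # 1
```

The minimum is 1 because both 夫 and 妻 are unavailable for お迎え on Thu, so 母 has to cover that slot. Enumeration only works because the problem is tiny (3^10 assignments); the solver is what scales.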





# Keep the difference in the number of shifts over the whole week small
n_act_wife = pulp.lpSum(x['妻', d, a] for d in Day for a in Action)
n_act_husband = pulp.lpSum(x['夫',d, a] for d in Day for a in Action)
ShiftScheduling += n_act_wife - n_act_husband <= 1, "Constraint_fairness_wife"
ShiftScheduling += n_act_husband - n_act_wife <= 1, "Constraint_fairness_husband"



# Each member covers at most one of morning drop-off or pick-up per day
for m in Member:
    for d in Day:
        ShiftScheduling += pulp.lpSum(x[m, d, a] for a in Action) <= 1, "Constraint_Either_{:}_{:}".format(m,d)
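To see what the added constraints rule out, a candidate schedule can be checked by hand. The sketch below is an illustration under assumptions: the `violations` helper and the `candidate` schedule are hypothetical and are not output from the solver. A schedule is stored as a dict from (day, action) to member, so the "exactly one person per slot" constraint holds by construction.

```python
from collections import Counter

Day = ["Mon", "Tue", "Wed", "Thu", "Fri"]
Action = ["朝", "お迎え"]
NG = {("夫", "Tue", "朝"), ("妻", "Wed", "お迎え"), ("夫", "Thu", "お迎え"),
      ("妻", "Thu", "お迎え"), ("妻", "Fri", "朝")}


def violations(schedule):
    """Return a description of every constraint that the schedule breaks."""
    problems = []
    for d in Day:
        members = [schedule[d, a] for a in Action]
        for a, m in zip(Action, members):
            if (m, d, a) in NG:
                problems.append(f"{m} is unavailable for {a} on {d}")
        # At most one action per member per day.
        if members[0] == members[1]:
            problems.append(f"{members[0]} covers both actions on {d}")
    # Fairness: 妻 and 夫 may differ by at most one shift over the week.
    counts = Counter(schedule.values())
    if abs(counts["妻"] - counts["夫"]) > 1:
        problems.append("妻 and 夫 differ by more than one shift")
    return problems


# Hypothetical schedule, built by hand for this example.
candidate = {
    ("Mon", "朝"): "夫", ("Mon", "お迎え"): "妻",
    ("Tue", "朝"): "妻", ("Tue", "お迎え"): "夫",
    ("Wed", "朝"): "妻", ("Wed", "お迎え"): "夫",
    ("Thu", "朝"): "夫", ("Thu", "お迎え"): "母",
    ("Fri", "朝"): "夫", ("Fri", "お迎え"): "妻",
}
print(violations(candidate))  # []
```

This candidate uses 母 only once, which suggests that the fairness and one-action-per-day constraints need not raise the objective above 1.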





AI Stock Price Prediction Methods.

By Matthew Millar R&D Scientist at ユニファ


            This blog looks for the best method of predicting future stock prices in general.  It reviews prediction methods that appear in current research and are used in the industry, covering statistical methods, machine learning, artificial neural networks, and hybrid models.  In the research reviewed here, hybrid models gave better results than purely statistical methods and than pure machine learning or artificial neural network methodologies.


            Stock market prediction is an attempt to discover the possible future value of a stock.  Successful price predictions can bring higher profits, while bad predictions can produce losses.  One common view is that the market does not follow a regular pattern, so prediction using statistics or models would be impossible because technical factors cannot capture all of the variables that shape the movement of the stock market.  The efficient market hypothesis, by contrast, suggests that a stock price already reflects all the information that could affect it, leaving only unknown information, which always carries a level of unpredictability.

            Fundamental analysis looks at the company rather than just the numbers or charts.  It involves examining the company's past performance and the credibility of its accounts.  This approach is mainly used by fund managers, as it is one of the traditional methods and relies on publicly available information about the company.  Technical analysis does not consider the company's fundamentals but mainly looks at trends in the stock's past performance as a time series.  These are the two most common approaches to stock price prediction.  The current trend in price forecasting is the use of data mining technologies: Artificial Neural Networks (ANN), Machine Learning (ML) algorithms, and statistical models are now being used to help with prediction.

            Because the amount of trade data keeps increasing, data mining and data analysis on it can be very difficult with standard methods.  Given the amount of data produced daily, this can be considered Big Data analytics.  Because values change constantly throughout the day, it is difficult to monitor every available stock, let alone predict price movements.  This is where algorithms and models come into play.  With algorithmic models, predictions can be made with reasonable accuracy, giving investors better insight into what the data means without having to review excessive amounts of it.


            The goal of this blog is to look at possible methods for prediction using ML, ANN, and statistical models.  Statistical methods are currently the most commonly used for price prediction, but can ML, ANN, or hybrid systems give a more accurate prediction?  By looking in turn at statistical methods, pure ML, pure ANN, and hybrid models, a more comprehensive list can be made and a better idea formed of which approach is better for forecasting purposes.

Methodology Review:

            A few emerging methods are gaining popularity and applications over the traditional financial approaches.  These methods are based on ML, statistical models, ANN, or a hybrid model.  Some of the proposed models are purely statistical, some use pure ML or ANN, and others take a hybrid approach that combines statistics with ML or ANN.  Each model has its pros and cons and a kind of problem that it solves best.

Statistical Methods:

            By using data analysis, one can predict the closing price of a certain stock.  Several common data analysis methods are used to build a predictive model, and some of them are standard stock price models currently used in the market and by fund managers.  The models must agree on the direction of movement for the price movement to be predicted most accurately.  The models are: Typical Price (TP), Chaikin Money Flow Indicator (CMI), Stochastic Momentum Index (SMI), Relative Strength Index (RSI), Bollinger Bands (BB), Moving Average (MA), and Bollinger Signal.  Combining these algorithms allows a more accurate prediction by looking at upper and lower bands: a price above the upper band indicates a good selling point, and a price below the lower band indicates a good buying point.  Combining the results of several models gives a better chance of predicting a price change than any single statistical model (Kannan, Sekar, Sathik, & Arumugam, 2010).
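The band rule described above can be sketched in a few lines. This is an illustration only, not the combined model from the cited study: the function name `bollinger_signal` is made up here, and the 20-period window and band width of two standard deviations are conventional defaults that the text does not specify.

```python
from statistics import mean, pstdev


def bollinger_signal(prices, window=20, k=2.0):
    """Return 'sell', 'buy', or 'hold' for the latest price.

    window and k are assumed defaults, not values from the cited study.
    """
    if len(prices) < window:
        raise ValueError("need at least `window` prices")
    recent = prices[-window:]
    middle = mean(recent)          # moving average (MA) of the window
    spread = k * pstdev(recent)    # half-width of the band
    upper, lower = middle + spread, middle - spread
    last = prices[-1]
    if last > upper:
        return "sell"              # price broke above the upper band
    if last < lower:
        return "buy"               # price fell below the lower band
    return "hold"


print(bollinger_signal([100.0] * 19 + [110.0]))  # sell
```

A combined model in the spirit of the paragraph above would act only when several such indicators point the same way.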

Machine Learning and Artificial Neural Network Methods:

            Text mining has also been used to predict stock price trends, especially intraday trends.  Text mining techniques gave a 46% chance of knowing whether a stock would increase by 0.5%, decrease by 0.5%, or stay within that range, which is significantly better than a random predictor, which gave only around 33% accuracy for predicting stock price fluctuations.  The process gathers press releases, preprocesses them into usable data, and categorizes them into different news types.  Trading rules can then be derived from this data for particular stocks (Mittermayer, 2004).

            Pure ANNs are currently used for stock prediction as well as analysis.  They have given very reliable results, as ANNs tolerate errors well, can use large and complex data, and can produce useful predictions.  Forecasting even one stock requires many interacting input series.  Each neuron can represent a decision process, which allows an ANN to represent the interaction between the decisions of everyone in the market and so to model a market.  ANNs have been reported to be very effective at predicting stock prices (Kimoto, Asakawa, Yoda, & Takeoka, 1990; Li & Ma, 2010).

            ANNs are gaining acceptance in the financial sector.  Many techniques and applications use AI to create prediction models.  One common method is to use a genetic algorithm (GA) to aid in training the ANN for stock price prediction.  In most cases the GA is used for training the network, selecting the feature subset, and optimizing the topology.  A GA can also help with feature discretization and the determination of connection weights for an ANN (Kim and Ham, 2000).

Hybrid Models:

            Combining ML with an ANN brings together two very powerful methodologies.  An ML model can be used for data mining to define and predict the relationships between financial and economic variables.  Both level estimation and classification can be used to predict future values.  Multiple studies show that a trading strategy guided by a classification model can generate higher risk-adjusted profits than the traditional buy-and-hold strategy, and than strategies guided by the level estimates of an ANN or linear regression (Enke and Thawornwong, 2005).  ML is mainly based on supervised learning, which is not appropriate for long-term goals.  Reinforcement learning, however, is more suitable for modeling real-world situations such as stock price prediction.  Treating stock price prediction as a Markov process, the TD(0) reinforcement learning algorithm, which focuses on learning from experience, is combined with an ANN that is taught the state of each stock price trend at given times.  The issue with this approach is that accuracy decays drastically if the prediction period is either very short or very long, so it is only useful for mid-range price prediction (Lee, 2001).  Another hybrid model that has given high accuracy (around 77%) combines a decision tree and an ANN: the decision tree creates the rules for the forecasting decision and the ANN generates the prediction.  This combination is more accurate than either an ANN or a decision tree separately (Tsai and Wang, 2009).
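For readers unfamiliar with TD(0), the core update is small enough to show directly. The sketch below is a generic tabular version under assumptions: it stores values in a dict rather than in an ANN, the state names and reward are invented for illustration, and it is not the model from Lee (2001).

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    target = reward + gamma * values.get(next_state, 0.0)
    current = values.get(state, 0.0)
    values[state] = current + alpha * (target - current)
    return values[state]


# Hypothetical price-trend states and reward, for illustration only.
values = {}
first = td0_update(values, "uptrend", reward=1.0, next_state="flat")
second = td0_update(values, "flat", reward=0.0, next_state="uptrend")
print(values)
```

Each update moves the estimate for one state a fraction alpha of the way toward the observed reward plus the discounted estimate of the next state, which is what "learning from experience" refers to here.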

Other Useful Models:

            Support Vector Machines (SVM) have also been used for stock market prediction.  SVMs perform better than random guessing but are outperformed by hybrid models.  A combination of a genetic algorithm and an SVM, with technical analysis indicators as input features, can produce a better result than an SVM alone.  This method was used to produce a forecast not only for the stock being looked at but also for any other stock that is correlated with the target stock.  This hybrid can significantly outperform a standalone SVM (Choudhry and Garg, 2008).


            There are many ways to predict a future value, though some are more accurate than others.  Statistical analysis is the most common approach to prediction.  Linear regression, logistic regression, time series models, and so on are some of the more common ways of predicting future values.  However, these methods may not be the best for more complex and dynamic data sets.  If the data are not all of the same type, linear regression may give poor results.  This is where an ANN or ML model comes in: these can reach a higher accuracy than a purely statistical approach because they can work with the complex systems of a market.  In their pure form, ANN and ML models can be more accurate than many statistical methods for most stock price predictions.  A hybrid ANN, however, can give an even more accurate and useful model.  Combining a decision tree (DT) or a GA with an ANN can give greater accuracy than either pure method.






Reinforcement Learning: The Hello World Version

By Matthew Millar R&D Scientist at ユニファ

What is Reinforcement Learning?

Reinforcement learning (RL) is an agent-based subset of Artificial Intelligence (AI). An agent-based AI learns from interacting with its environment rather than from a dataset. The environment is controlled or altered by the actions of the agent. Changes in the environment should have goals linked to them so that the agent knows the result of each action. From these goals, appropriate actions can be learned for many different types of tasks. RL is one of the three learning techniques in machine learning; the other two are supervised and unsupervised learning. RL does not require labeled data, unlike supervised or semi-supervised learning.

In general, RL is modeled as a Markov Decision Process (MDP), which has a set of environment and agent states denoted S and a set of all possible actions available to the agent denoted A. A probability function gives the likelihood of a transition between states (s1 -> s2) when a certain action (a) is performed. A reward is also needed to drive the system forward to learn. Normally this is an immediate reward received after the transition between states, denoted Ra(s, s'). There must also be a set of rules that control what the agent receives as feedback, or what it can observe.


The interaction between an agent and its environment happens in discrete time steps. At each time step t, the agent receives an observation (o_t) and a reward (r_t). The agent then chooses an action (a_t) from its predefined set of actions. This action is sent to the environment, which moves to its next state (s_t+1). As the environment moves to its next state, a new reward (r_t+1) is generated. This state change creates a transition (s_t, a_t, s_t+1) that allows the agent to learn from its actions.

The agent has to learn what optimal behavior is, which leads to the idea of regret. Regret compares the performance of the current agent with that of an “optimal” agent and looks at the differences in both the actions taken and the rewards received. This is where instant rewards can hinder the performance of a model: the agent has to consider the long-term outcomes of its actions. Even if the instant reward for a certain action is negative, the total long-term reward could be higher. The aim is to maximize future outcomes and rewards. This allows RL to be applied to short-term goals as well as long-term solutions, though normally not both at the same time.

For an agent to increase its total reward for a task, it needs to balance exploration and exploitation. Exploitation basically means using what works: the agent keeps performing the actions that have given it the best rewards so far, even though they may not be optimal or maximize the reward available. Exploration addresses this by trying different actions at a given time step in the hope of improving the result. In general, simple exploration methods are preferred, as very few exploration algorithms scale to very complex environments and action sets. One approach is to choose the best long-term action most of the time and occasionally choose a random action instead, to see whether it improves on the already defined “optimal” long-term option.
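A common way to implement "mostly the best action, sometimes a random one" is epsilon-greedy selection. The sketch below assumes the agent keeps one estimated value per action in a list; the function name and the default epsilon of 0.1 are choices made for this example.

```python
import random


def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=q_values.__getitem__)   # exploit


estimates = [0.1, 0.9, 0.3]
print(epsilon_greedy(estimates, epsilon=0.0))  # 1
```

With epsilon set to 0 the agent only exploits, and with epsilon set to 1 it only explores; values in between trade one off against the other.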

There are many options for controlling the learning process of RL agents. Defining which actions are beneficial is still difficult to get right. The simplest method, if the action set is finite, is to use a policy. A policy maps each state of the environment to the probability of taking each action. Policies are the basis of all other learning algorithms. The state-value function builds on a policy by giving the expected return from starting in a state and then following the policy. It evaluates the benefit of being in certain states over others. The Monte Carlo (MC) method can be used for policy evaluation and improvement. The MC evaluation step looks at every state-action pair and averages all the returns observed from it. With enough episodes this yields an accurate Q-table that evaluates the policy. The improvement step uses a greedy policy: it looks at each state-action pair and picks, for each state, the action that maximizes the return. These are just two methods for controlling the learning of an RL agent.
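The two Monte Carlo steps can be sketched as follows. This is a minimal every-visit version under assumptions: episodes are given as lists of (state, action, reward) tuples, and the function names and the tiny example data are invented for illustration.

```python
from collections import defaultdict


def mc_q_table(episodes, gamma=1.0):
    """Evaluation step: average the returns seen after each (state, action) pair."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:             # episode: list of (state, action, reward)
        ret = 0.0
        for state, action, reward in reversed(episode):
            ret = reward + gamma * ret   # return from this step to the end
            totals[state, action] += ret
            counts[state, action] += 1
    return {key: totals[key] / counts[key] for key in totals}


def greedy_policy(q_table):
    """Improvement step: for each state, pick the action with the highest Q."""
    best = {}
    for (state, action), q in q_table.items():
        if state not in best or q > q_table[state, best[state]]:
            best[state] = action
    return best


episodes = [
    [("s0", "left", 0.0), ("s1", "right", 1.0)],
    [("s0", "right", 0.0), ("s1", "right", 3.0)],
]
q = mc_q_table(episodes)
print(greedy_policy(q))
```

Here the pair (s1, right) was seen twice with returns 1 and 3, so its estimate is their average, 2.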

Simple Example:

This was a very quick overview of the main principles of RL. Now let’s get to our first project.
So, let’s start with the hello world of OpenAI Gym. Note that this is just a first step on our way to the final project that I have planned.
The first step is to get all the dependencies.

Tensorflow: pip install --upgrade tensorflow
Keras: pip install Keras
Numpy: pip install numpy
OpenAI Gym: pip install gym

Now that we have all the requirements, let’s get down to the nuts and bolts.
Let’s define our rewards method.

Calculate Rewards (reward, gamma value)
		The new reward is reward + (previous reward * gamma value) 

This method lets us calculate the reward for every action.
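One way to read the pseudocode above is as a discounted return computed from the end of an episode backwards. The sketch below makes that assumption explicit: "previous reward" is taken to mean the running total from the later steps, and the function name and default gamma are illustrative.

```python
def discount_rewards(rewards, gamma=0.99):
    """Return the discounted return for each step, computed back to front."""
    discounted = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # new reward = reward + (previous reward * gamma)
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted


print(discount_rewards([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```

Earlier steps end up with larger values because they are credited with part of every reward that follows them.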
Now let’s use a custom log loss function as well.

Custom Loss (True Value, Predicted Value)
		Log(true * (true - predicted) + (1 - true) * (true + predicted))
		Take the mean of the results of this
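The expression can be tried out in plain Python before wiring it into Keras. The sketch below transcribes the pseudocode literally for one-hot labels and a list of predicted probabilities; it is an illustration, not the Keras loss used for training, and the function name is made up.

```python
import math


def custom_log_loss(y_true, y_pred):
    """Mean of log(true * (true - pred) + (1 - true) * (true + pred))."""
    terms = [
        math.log(t * (t - p) + (1 - t) * (t + p))
        for t, p in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)


# One-hot label for action 0 and hypothetical predicted probabilities.
print(custom_log_loss([1, 0], [0.25, 0.75]))
```

For an entry where true is 1 the term is log(1 - predicted), and where true is 0 it is log(predicted), so a prediction of exactly 1 on the labeled entry (or 0 on an unlabeled one) would make the logarithm undefined.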

Now a scoring function to see how accurate our AI is for testing.

Evaluate Model (Model, number of tests)
		Total reward = [] # Holds the summed reward of each run
		For each run through the environment
			Sum of rewards = 0
			Have the agent run through the environment collecting rewards
			Add each collected reward to the Sum of rewards
			After the run, append the Sum of rewards to the total reward list
		Then take the mean, median, mode, or any other summary of the list as the model score
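The evaluation steps above can be sketched without installing Gym by using a stand-in environment. Everything here is hypothetical scaffolding: `CountdownEnv` is a toy class written for this example that mimics the classic Gym `reset()` and four-value `step()` interface, and `policy` is any function from observation to action.

```python
from statistics import mean


class CountdownEnv:
    """Toy stand-in environment: every episode lasts `length` steps, reward 1 each."""

    def __init__(self, length=5):
        self.length = length
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.steps

    def step(self, action):
        self.steps += 1
        done = self.steps >= self.length
        return self.steps, 1.0, done, {}


def evaluate_model(policy, env, num_tests):
    """Run `num_tests` episodes and return the mean total reward."""
    total_rewards = []                    # summed reward of each run
    for _ in range(num_tests):
        obs = env.reset()
        episode_reward = 0.0
        done = False
        while not done:
            obs, reward, done, _info = env.step(policy(obs))
            episode_reward += reward
        total_rewards.append(episode_reward)
    return mean(total_rewards)


print(evaluate_model(lambda obs: 0, CountdownEnv(length=5), num_tests=3))  # 5.0
```

With a real Gym environment the policy would call the prediction model on the observation and sample an action from its output.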

Now the Keras model part.

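# Set up Keras model
# Assumes env, lr, and my_ri_loss are defined elsewhere in the script;
# the tensorflow.keras import path is also an assumption
from tensorflow.keras import layers
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

action_count = env.action_space.n
input_shape = layers.Input(shape=env.reset().shape)
adv = layers.Input(shape=[1])

# The remaining Dense arguments are cut off in the source; only the layer wiring is restored
x = layers.Dense(8, activation="relu")(input_shape)
out = layers.Dense(action_count, activation="softmax")(x)

# Create the training model and prediction model
train_model = Model(inputs=[input_shape, adv], outputs=out)
train_model.compile(loss=my_ri_loss, optimizer=Adam(lr))
test_model = Model(inputs=[input_shape], outputs=out)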

This will allow us to then try out the test environment that is in OpenAI gym.
Let us start with

env = gym.make("CartPole-v0")

Putting it all together should give you this.

With the output of

Average reward for training episode 100: 19.30 Test Score: 13.60 Loss: 0.030333 
Average reward for training episode 200: 22.45 Test Score: 10.50 Loss: 0.022194 
Average reward for training episode 8200: 25.68 Test Score: 191.40 Loss: -0.003017 
Solved in 8199 episodes!


This concludes this basic explanation of RL. We learned the basics of reinforcement learning and how to code it up using a simple example in Python. Now we can start to get into more complex models and even, later on, teach something to run!

Part two so soon!?!?

Well, yes, as this was a simple version and easy to follow. Let’s look at using a retro game and start playing things like Space Invaders and Mario!
The first thing we need to do is install Gym Retro using
pip3 install gym-retro
And that’s it. To test this out we will use the built-in ROM for Airstriker-Genesis, but other ROMs can be downloaded and used as well.

import retro

def main():
    env = retro.make(game='Airstriker-Genesis')
    obs = env.reset()
    while True:
        # Take a random action on every frame
        obs, rew, done, info = env.step(env.action_space.sample())
        env.render()  # draw the game window
        if done:
            obs = env.reset()

if __name__ == "__main__":
    main()

And that will give you the below screen output.


Now that we know it is working, we need to import other games into the environment. The simplest way is to import all your ROMs into Gym Retro by going into the terminal and typing this

python3 -m retro.import /path/to/your/ROMs/directory/

But you will need to find ROMs that work with gym-retro.

Our next step is to start with Unity3D and see how to implement RL in a 3D environment rather than these simpler 2D games. Soon we will look at teaching an AI to walk.





















 Tired of nori, scallions, and ginger as well? No, there is still wasabi. Let's eat boiled wheat flour with mentsuyu.

















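Vertex type | What it does
State | Nothing (only ever chosen as the start vertex)
Action | If it is not the start vertex, ends the search
Condition | Checks the environment against the given condition and moves to a different vertex depending on whether the result is true or false
Selection | Sets a target (object, place, or time) for that goal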














Range | What to do
Top 10 | Keep as they are
Next 20 | Form pairs of two chosen at random
Next 5 | Remove each edge with 10% probability
Others | Delete











  • 3 p.m. to 4 p.m.
  • 5 p.m. to 6 p.m.


  • 10 a.m. to 4 p.m.
  • 3 p.m. to 5 p.m.


To make time ranges easier to write, [15:00, 16:00] means 3 p.m. to 4 p.m. Note that [15:00, 16:00] and [16:00, 17:00] are treated as not overlapping.



There are six possible relationships between two time ranges in total. Below, the gold time range 1 is [S1, E1] and the light-blue time range 2 is [S2, E2].

  1. No overlap, E1 <= S2
  2. Overlap, S1 < S2 < E1
  3. Overlap, S2 <= S1 < E1 <= E2
  4. Overlap, S1 <= S2 < E2 <= E1
  5. Overlap, S1 < E2 < E1
  6. No overlap, E2 <= S1



  ALL CASES - <CASE 1> - <CASE 6>
=!<CASE 1>   && !<CASE 6>
=!(E1 <= S2) && !(E2 <= S1)
= (E1 > S2)  &&  (E2 > S1)
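The final condition translates directly into code. The sketch below is one possible rendering in Python using `datetime.time` values; the function name `overlaps` is chosen for this example.

```python
from datetime import time


def overlaps(s1, e1, s2, e2):
    """True when [s1, e1] and [s2, e2] share more than a boundary point."""
    return e1 > s2 and e2 > s1


# 3 p.m. to 4 p.m. against 5 p.m. to 6 p.m.
print(overlaps(time(15), time(16), time(17), time(18)))  # False
# 10 a.m. to 4 p.m. against 3 p.m. to 5 p.m.
print(overlaps(time(10), time(16), time(15), time(17)))  # True
# Back-to-back ranges are treated as not overlapping.
print(overlaps(time(15), time(16), time(16), time(17)))  # False
```

Checking the two "no overlap" cases and negating them keeps the test to two comparisons instead of enumerating the four overlapping cases.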








Rectangle 1: (x1, y1), (x2, y2)
Rectangle 2: (x3, y3), (x4, y4)
[x1, x2] and [x3, x4] overlap -> (x2 > x3) && (x4 > x1)
[y1, y2] and [y3, y4] overlap -> (y2 > y3) && (y4 > y1)
(x2 > x3) && (x4 > x1) && (y2 > y3) && (y4 > y1)
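As a sketch, the combined condition can be written as a small Python function. It assumes each rectangle is given as a pair of corners with x1 < x2 and y1 < y2 (and likewise for the second rectangle); the function name is chosen for this example.

```python
def rects_overlap(r1, r2):
    """True when two axis-aligned rectangles overlap on both axes."""
    (x1, y1), (x2, y2) = r1
    (x3, y3), (x4, y4) = r2
    return x2 > x3 and x4 > x1 and y2 > y3 and y4 > y1


print(rects_overlap(((0, 0), (2, 2)), ((1, 1), (3, 3))))  # True
print(rects_overlap(((0, 0), (1, 1)), ((1, 0), (2, 1))))  # False
```

The second pair only shares an edge, which counts as not overlapping, in line with the back-to-back time ranges.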



Rectangle 1: (x1, y1), (x2, y2)
Rectangle 2: (x3, y3), (x4, y4)
w1 = (x2 - x1), h1 = (y2 - y1)
w2 = (x4 - x3), h2 = (y4 - y3)
Center 1: ((x2 + x1) / 2, (y2 + y1) / 2)
Center 2: ((x4 + x3) / 2, (y4 + y3) / 2)
Distance between centers: (
    ABS((x4 + x3) / 2 - (x2 + x1) / 2), 
    ABS((y4 + y3) / 2 - (y2 + y1) / 2)
)
ABS((x4 + x3) / 2 - (x2 + x1) / 2) < (w1 + w2) / 2 && 
    ABS((y4 + y3) / 2 - (y2 + y1) / 2) < (h1 + h2) / 2
ABS((x4 + x3) - (x2 + x1)) < (w1 + w2) && 
    ABS((y4 + y3) - (y2 + y1)) < (h1 + h2)
ABS((x4 + x3) - (x2 + x1)) < ((x2 - x1) + (x4 - x3)) && 
    ABS((y4 + y3) - (y2 + y1)) < ((y2 - y1) + (y4 - y3))
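The center-distance form can be sketched the same way. The function below uses the last line of the derivation, with the division by 2 already cancelled, and again assumes corners ordered so that x1 < x2, y1 < y2, x3 < x4, and y3 < y4; the function name is chosen for this example.

```python
def rects_overlap_by_center(r1, r2):
    """Overlap test comparing the center distance with the summed half-sizes."""
    (x1, y1), (x2, y2) = r1
    (x3, y3), (x4, y4) = r2
    return (abs((x4 + x3) - (x2 + x1)) < (x2 - x1) + (x4 - x3)
            and abs((y4 + y3) - (y2 + y1)) < (y2 - y1) + (y4 - y3))


print(rects_overlap_by_center(((0, 0), (2, 2)), ((1, 1), (3, 3))))  # True
print(rects_overlap_by_center(((0, 0), (1, 1)), ((1, 0), (2, 1))))  # False
```

Expanding the absolute value on the x axis gives x3 < x2 and x1 < x4, so this test agrees with comparing the corner coordinates directly.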






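I'm Honma, a web engineer, and I just barely got my Tokyo Olympics ticket lottery application in before the deadline. I would love to win for any day at all, but winning every draw would put me in a bind, so I am left with rather mixed feelings.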

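At our company, we use Vue.js for the frontend implementation of a project currently under development. In this post I would like to share what I looked into while implementing a form that spans multiple pages with Vue.js.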



Good morning, good afternoon, and good evening! This is Suzuki from the infrastructure team at ユニファ, back after a long while.






